Xingrun-Xing / SpikeLM

This is the implementation of our paper "SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms" (ICML 2024).

Is grad_scale necessary ? #3

Open ghost opened 1 month ago

ghost commented 1 month ago

Appreciate your work!

I have some questions about the code in spiking.py

(1) Is grad_scale necessary? I found that if I remove grad_scale, my own model converges faster (without using AlphaInit).

(2) Is the ElasticBiSpiking method suitable for CV tasks such as classification and object detection?

(3) Should I always use ElasticBiSpiking with AlphaInit? Judging from your paper, I think the answer is no. If I want to use the ElasticBiSpiking method without AlphaInit, should I remove grad_scale? I suspect grad_scale makes the gradient too small.
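For context on why grad_scale affects convergence: in quantization-style codebases this helper is commonly the LSQ straight-through trick, which leaves the forward value unchanged but rescales the backward gradient. A minimal sketch of that pattern (this is the generic formulation, not necessarily the exact code in spiking.py):

```python
import torch

def grad_scale(x: torch.Tensor, scale: float) -> torch.Tensor:
    """Identity in the forward pass; multiplies the gradient by `scale`
    in the backward pass (LSQ-style straight-through trick)."""
    y = x * scale
    # (x - y).detach() carries the value difference with no gradient,
    # so the output equals x numerically but d(out)/dx = scale.
    return (x - y).detach() + y

# The forward value is unchanged, but the gradient is scaled by 0.1.
x = torch.tensor([2.0], requires_grad=True)
out = grad_scale(x, 0.1)
out.sum().backward()
```

With `scale < 1` this shrinks the gradient flowing into the scaled parameter, which is consistent with the observation that removing it can speed up convergence at the risk of less stable training.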

Thanks again for your work! This is my first post on GitHub.

Xingrun-Xing2 commented 4 weeks ago

Hi, sorry for the late reply.

(1) alpha_init is there to ensure the sparsity of SNNs. Dropping it may result in a high spike firing rate.

(2) Yes, it is a general spike encoding method.

(3) You can refer to Section "4.2. Spike Frequency Encoding" in the paper, which describes how alpha is calculated.
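As a rough illustration of frequency-aware amplitude scaling (a sketch only; the hypothetical `spike_amplitude` below assumes alpha is taken from the mean absolute activation, as in common binarization schemes, and is not the paper's exact formula, for which see Section 4.2):

```python
import torch

def spike_amplitude(x: torch.Tensor) -> torch.Tensor:
    # One common choice: scale the {-1, 0, +1} spike levels by the mean
    # absolute value of the pre-spike activations, so the spike train
    # roughly matches the energy of the full-precision signal.
    return x.abs().mean()

acts = torch.tensor([-2.0, 0.0, 1.0, 3.0])
alpha = spike_amplitude(acts)  # mean of |x| = (2 + 0 + 1 + 3) / 4 = 1.5
```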