fangwei123456 / spikingjelly

SpikingJelly is an open-source deep learning framework for Spiking Neural Network (SNN) based on PyTorch.
https://spikingjelly.readthedocs.io

Is the threshold of neurons such as MultiStepLIFNode trainable? If not, is there a way to turn the threshold into a trainable parameter? #371

Closed yazhuoguo closed 1 year ago

fangwei123456 commented 1 year ago

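For the PyTorch version of the neurons this is fairly easy: just wrap the threshold in nn.Parameter. The CuPy version is much more troublesome, because you would have to write the CUDA kernel by hand.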

yazhuoguo commented 1 year ago

Can I do this for a CuPy neuron by switching its backend to torch and then wrapping the threshold in nn.Parameter?

fangwei123456 commented 1 year ago

Yes.

yazhuoguo commented 1 year ago

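The input to nn.Parameter must be a tensor, but the threshold is a float. I wrapped it, but found that the threshold is not trained during training. The code is as follows:

self.v_th = nn.Parameter(torch.tensor([v_th]))
MultiStepLIFNode(tau=2.0, v_threshold=self.v_th.item(), detach_reset=True, backend='cupy')
with open("npresult", "a") as f:
    np.savetxt(f, np.array(self.v_th.cpu().detach()), fmt='%.04f')
f.close()

Is there anything wrong with this? Also, you said that the CuPy version of the neuron needs a hand-written CUDA kernel. I did not modify anything there and it still trains. Is that the reason the threshold is not being trained?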

fangwei123456 commented 1 year ago

Try changing it to MultiStepLIFNode(tau=2.0, v_threshold=self.v_th, detach_reset=True, backend='cupy')

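"I did not modify anything there and it still trains. Is that the reason the threshold is not being trained?" No. You are using the torch backend, so CuPy is not involved.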
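A minimal sketch of why passing `self.v_th.item()` cannot train the threshold. It uses plain PyTorch and a hypothetical `ToyThreshold` module (an illustration, not SpikingJelly code). `.item()` returns a Python float, so the value used in the forward pass is no longer connected to the `nn.Parameter` in the autograd graph.

```python
import torch
import torch.nn as nn


class ToyThreshold(nn.Module):
    """Hypothetical stand-in for a neuron layer; not part of SpikingJelly."""

    def __init__(self, v_th: float = 1.0, use_item: bool = False):
        super().__init__()
        self.v_th = nn.Parameter(torch.tensor(v_th))
        self.use_item = use_item

    def forward(self, v: torch.Tensor) -> torch.Tensor:
        # .item() yields a plain Python float that autograd cannot track.
        th = self.v_th.item() if self.use_item else self.v_th
        # Smooth stand-in for the firing function so that gradients exist.
        return torch.sigmoid(4.0 * (v - th))


v = torch.rand(8, 4)

# Case 1: the threshold enters the forward pass as a float.
detached = ToyThreshold(use_item=True)
out_detached = detached(v).sum()
print("output tracks gradients:", out_detached.requires_grad)

# Case 2: the parameter itself is used, so it receives a gradient.
connected = ToyThreshold(use_item=False)
connected(v).sum().backward()
print("threshold gradient:", connected.v_th.grad)
```

In the first case the output does not require gradients at all, so no optimizer can move `v_th`. In the second case the parameter gets a gradient after `backward()`.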

yazhuoguo commented 1 year ago

That change raises an error: assert isinstance(v_threshold, float)
AssertionError. self.v_th is now a tensor, but the neuron requires a float threshold.

fangwei123456 commented 1 year ago

Then you need to subclass MultiStepLIFNode and modify its __init__ function.

yazhuoguo commented 1 year ago
class MultiStepLIFNode2(MultiStepLIFNode):
    def __init__(self, tau: float = 2., decay_input: bool = True, v_threshold: float = 1.,
                 v_reset: float = 0., surrogate_function: Callable = surrogate.Sigmoid(),
                 detach_reset: bool = False, backend='torch', lava_s_cale=1 << 6):
        super().__init__(tau, decay_input, v_threshold, v_reset, surrogate_function, detach_reset, backend, lava_s_cale)
        self.v_threshold = nn.Parameter(torch.tensor([v_threshold]))
        '''
        with open("/root/cifar10/npresult.txt", "a") as f:
            np.savetxt(f,np.array([(self.v_threshold).cpu().detach()]),fmt='%.04f')
        f.close()
        '''

This is MultiStepLIFNode2, which I obtained by subclassing MultiStepLIFNode and modifying the __init__ function:

(attn_lif): MultiStepLIFNode2(
          v_threshold=Parameter containing:
          tensor([1.], requires_grad=True), v_reset=0.0, detach_reset=True, tau=2.0, backend=torch
          (surrogate_function): Sigmoid(alpha=4.0, spiking=True)
        )

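This is the part of the model that uses MultiStepLIFNode2. You can see that v_threshold is now an nn.Parameter, but after training I found that the threshold still has not changed. What could be wrong?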

yazhuoguo commented 1 year ago

Also, is there any precedent for training the threshold?

fangwei123456 commented 1 year ago

There are quite a few papers on training the threshold; they were already appearing two years ago. The code looks fine.

yazhuoguo commented 1 year ago

OK, thanks!

fangwei123456 commented 1 year ago
import torch
import torch.nn as nn
from spikingjelly.activation_based import layer, neuron, surrogate

class IFNode(neuron.IFNode):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        v_threshold = self.v_threshold
        self.v_threshold = nn.Parameter(torch.as_tensor(v_threshold))

x_seq = torch.rand([8, 4], requires_grad=True)

net = IFNode(step_mode='m')

net(x_seq).sum().backward()

print(net.v_threshold.grad)

The output is

tensor(-10.1715)
yazhuoguo commented 1 year ago

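Following your example, I subclassed MultiStepLIFNode in neuron.py, but even though the threshold wrapped in nn.Parameter has a gradient, it is still not updated. My code is shown in the images. As for why I did not override forward in MultiStepLIFNode2 and write the threshold to a txt file there: if I do that, self.v_seq on line 850 of the first image becomes a tensor, and an error says that a tensor has no attribute append.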

[Attached images: 4, 5, 6]

fangwei123456 commented 1 year ago

Please provide the minimal code that reproduces your problem, along with the framework version. Do not post code as images.

yazhuoguo commented 1 year ago

Sorry for the inconvenience. I have reorganized the question as follows:

class MultiStepLIFNode(LIFNode): # only part of the forward function is pasted, for brevity

    def __init__(self, tau: float = 2., decay_input: bool = True, v_threshold: float = 1.,
             v_reset: float = 0., surrogate_function: Callable = surrogate.Sigmoid(),
             detach_reset: bool = False, backend='torch', lava_s_cale=1 << 6):

        super().__init__(tau, decay_input, v_threshold, v_reset, surrogate_function, detach_reset)
        self.register_memory('v_seq', None)
        check_backend(backend)
        self.backend = backend
        self.lava_s_cale = lava_s_cale

    def forward(self, x_seq: torch.Tensor):
        assert x_seq.dim() > 1
        # x_seq.shape = [T, *]

        if self.backend == 'torch':
            spike_seq = []
            self.v_seq = []
            for t in range(x_seq.shape[0]):
                spike_seq.append(super().forward(x_seq[t]).unsqueeze(0))
                self.v_seq.append(self.v.unsqueeze(0))
            spike_seq = torch.cat(spike_seq, 0)
            self.v_seq = torch.cat(self.v_seq, 0)
            '''Write the threshold to a txt file to check whether it gets updated'''
            with open("/root/cifar10/npresult.txt", "a") as f: 
                g = [self.v_threshold]
                g = torch.tensor(g)
                np.savetxt(f,np.array(g.cpu().detach()),fmt='%.04f')
            f.close()
            return spike_seq
class MultiStepLIFNode2(MultiStepLIFNode): # inherits from MultiStepLIFNode

    def __init__(self, *args, **kwargs):

        super().__init__(*args, **kwargs)
        v_threshold = self.v_threshold
        self.v_threshold = nn.Parameter(torch.as_tensor(v_threshold))

Below is where MultiStepLIFNode2 is used in the model. The framework version is 0.0.0.0.12:

self.attn_lif = MultiStepLIFNode2(tau=2.0, v_threshold=1.0, detach_reset=True, backend='torch')

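Here the threshold of MultiStepLIFNode2 already has a gradient, but after training the threshold has not changed, and I do not know where the problem is. The step function is replaced by a sigmoid during backpropagation, so the derivative of the loss with respect to the threshold should not be 0, should it?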
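The reasoning in the last sentence can be checked in isolation. The following is a minimal sketch in plain PyTorch, with a hypothetical `SurrogateHeaviside` function written for illustration (it is not SpikingJelly's `surrogate.Sigmoid`, although it follows the same idea): a step function in the forward pass, and the derivative of a sigmoid with slope `alpha` in the backward pass.

```python
import torch


class SurrogateHeaviside(torch.autograd.Function):
    """Hypothetical illustration: Heaviside forward, sigmoid-derivative backward."""

    @staticmethod
    def forward(ctx, x, alpha):
        ctx.save_for_backward(x)
        ctx.alpha = alpha
        # Forward pass: hard step, 1 where x >= 0 and 0 elsewhere.
        return (x >= 0).to(x.dtype)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        s = torch.sigmoid(ctx.alpha * x)
        # Backward pass: derivative of sigmoid(alpha * x); no gradient for alpha.
        return grad_output * ctx.alpha * s * (1 - s), None


v = torch.tensor([0.2, 0.9, 1.4])             # membrane potentials (made-up values)
v_th = torch.tensor(1.0, requires_grad=True)  # trainable threshold

spike = SurrogateHeaviside.apply(v - v_th, 4.0)
spike.sum().backward()

print("spikes:", spike)
print("d(sum of spikes)/d(v_th):", v_th.grad)
```

Because the surrogate derivative is strictly positive everywhere and the threshold enters as `v - v_th`, the gradient with respect to `v_th` is non-zero and negative in this sketch. A threshold that has a gradient but never moves therefore points to a problem with how the parameter is registered or passed to the optimizer, not with the surrogate gradient.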

fangwei123456 commented 1 year ago
from spikingjelly.clock_driven.neuron import *

class MultiStepLIFNode(LIFNode): # only part of the forward function is pasted, for brevity

    def __init__(self, tau: float = 2., decay_input: bool = True, v_threshold: float = 1.,
             v_reset: float = 0., surrogate_function: Callable = surrogate.Sigmoid(),
             detach_reset: bool = False, backend='torch', lava_s_cale=1 << 6):

        super().__init__(tau, decay_input, v_threshold, v_reset, surrogate_function, detach_reset)
        self.register_memory('v_seq', None)
        check_backend(backend)
        self.backend = backend
        self.lava_s_cale = lava_s_cale

    def forward(self, x_seq: torch.Tensor):
        assert x_seq.dim() > 1
        # x_seq.shape = [T, *]

        if self.backend == 'torch':
            spike_seq = []
            self.v_seq = []
            for t in range(x_seq.shape[0]):
                spike_seq.append(super().forward(x_seq[t]).unsqueeze(0))
                self.v_seq.append(self.v.unsqueeze(0))
            spike_seq = torch.cat(spike_seq, 0)
            self.v_seq = torch.cat(self.v_seq, 0)
            return spike_seq

class MultiStepLIFNode2(MultiStepLIFNode): # inherits from MultiStepLIFNode
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        v_threshold = self.v_threshold
        del self.v_threshold
        self.v_threshold = nn.Parameter(torch.as_tensor(v_threshold))

x_seq = torch.rand([8, 4], requires_grad=True)

net = MultiStepLIFNode2()
print(net)
print('threshold =', net.v_threshold)
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)

net(x_seq).sum().backward()

print('threshold.grad =', net.v_threshold.grad)

optimizer.step()

print('after bp, threshold =', net.v_threshold)

With the code above, gradient descent on the threshold now works.

I checked the code. In version 0.0.0.0.12 the threshold is treated as part of the memory:

https://github.com/fangwei123456/spikingjelly/blob/1171f5249a4ebeace6ab8d6a74d85579fafb93ed/spikingjelly/clock_driven/neuron.py#L88

Therefore you must first call del self.v_threshold to remove it from the memory, and only then set it as an nn.Parameter.
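The effect can be reproduced with a simplified model. The sketch below uses a hypothetical `ToyMemoryModule` that keeps some attributes in a private `_memories` dict and intercepts attribute assignment for those names. This is an assumption about the general mechanism, written for illustration, and not a copy of SpikingJelly's implementation. It shows why an `nn.Parameter` assigned over a memory entry never reaches `parameters()` unless the entry is deleted first.

```python
import torch
import torch.nn as nn


class ToyMemoryModule(nn.Module):
    """Hypothetical stand-in for a module that stores some attributes as 'memory'."""

    def __init__(self, v_threshold: float = 1.0):
        super().__init__()
        self._memories = {"v_threshold": v_threshold}

    def __getattr__(self, name):
        memories = self.__dict__.get("_memories", {})
        if name in memories:
            return memories[name]
        return super().__getattr__(name)

    def __setattr__(self, name, value):
        memories = self.__dict__.get("_memories")
        if memories is not None and name in memories:
            # Intercepted: the value is stored as memory and nn.Module never
            # sees it, so an nn.Parameter assigned here is not registered.
            memories[name] = value
        else:
            super().__setattr__(name, value)

    def __delattr__(self, name):
        memories = self.__dict__.get("_memories", {})
        if name in memories:
            del memories[name]
        else:
            super().__delattr__(name)


# Without del: the Parameter is swallowed by the memory dict.
without_del = ToyMemoryModule()
without_del.v_threshold = nn.Parameter(torch.as_tensor(1.0))
print(len(list(without_del.parameters())))

# With del: the name leaves the memory dict, so nn.Module registers the Parameter.
with_del = ToyMemoryModule()
value = with_del.v_threshold
del with_del.v_threshold
with_del.v_threshold = nn.Parameter(torch.as_tensor(value))
print(len(list(with_del.parameters())))
```

In the first case `parameters()` is empty, so an optimizer built from it has nothing to update even though the tensor itself can still receive a gradient. In the second case the threshold is a registered parameter.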

yazhuoguo commented 1 year ago

OK, thank you very much for your help!

fangwei123456 commented 1 year ago

If there are further problems, feel free to reopen this issue.

KaiSUN1 commented 3 months ago

Hello, if I stack several layers, will the thresholds of the earlier layers be deleted as well?
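A hedged note on this question, which the thread does not answer: in plain Python and PyTorch, `del self.v_threshold` inside `__init__` acts only on the instance being constructed, so stacked layers should not affect one another. A minimal sketch with a hypothetical `ToyNode` (not SpikingJelly code; the behavior of the real neuron classes is assumed to be the same but is not verified here):

```python
import torch
import torch.nn as nn


class ToyNode(nn.Module):
    """Hypothetical layer that replaces its float threshold with a Parameter."""

    def __init__(self, v_threshold: float = 1.0):
        super().__init__()
        self.v_threshold = v_threshold
        value = self.v_threshold
        # Deletes the attribute on this instance only.
        del self.v_threshold
        self.v_threshold = nn.Parameter(torch.as_tensor(value))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Smooth stand-in for a firing function.
        return torch.sigmoid(4.0 * (x - self.v_threshold))


net = nn.Sequential(ToyNode(1.0), ToyNode(0.5), ToyNode(0.25))

names = [name for name, _ in net.named_parameters()]
values = [p.item() for p in net.parameters()]
print(names)
print(values)
```

Each layer keeps its own independently trainable threshold, and constructing a later layer leaves the earlier ones untouched.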