Open NicolaCST opened 1 year ago
Can you provide minimal code to reproduce the error? I just tested the IF neuron on the master branch, and it does not raise an error:
from spikingjelly.activation_based import surrogate, neuron
import torch
net = neuron.IFNode(backend='cupy', step_mode='m', surrogate_function=surrogate.ATan())
x = torch.rand([8, 4], device='cuda:0', requires_grad=True)
y = net(x)
y.sum().backward()
This is my net definition.
from spikingjelly.activation_based import neuron, encoding, functional, surrogate, layer
from spikingjelly import visualizing
import torch
import torch.nn as nn
import torchvision
import numpy as np
from torch.utils.tensorboard import SummaryWriter
from torch.cuda import amp
import torch.nn.functional as F
class SNN(nn.Module):
    def __init__(self, T:int):
        super().__init__()
        self.T = T
        self.layer = nn.Sequential(
            layer.Flatten(),
            layer.Linear(28 * 28, 10, bias=False),
            neuron.IFNode(surrogate_function=surrogate.ATan()))
        functional.set_step_mode(self, step_mode="m")
        functional.set_backend(self, backend="cupy")

    #----------- HERE (2) -------- #
    def forward(self, x:torch.Tensor):
        x_seq = x.unsqueeze(0).repeat(self.T, 1, 1, 1, 1)
        x_seq = self.layer(x_seq)
        return x_seq.mean(0)

net = SNN(T=5)
The train loop is defined as:
def only_train_multistep(net, epochs, train_data_loader, device, writer):
    scaler = amp.GradScaler()
    functional.reset_net(net)
    print("Training new mode...")
    net.train()
    for epoch in range(epochs):
        print("Epoch {}:".format(epoch))
        train_loss = 0
        train_acc = 0
        train_samples = 0
        for img, label in train_data_loader:
            optimizer.zero_grad()
            img = img.to(device)
            label = label.to(device)
            label_onehot = F.one_hot(label, 10).float()
            if scaler is not None:
                with amp.autocast():
                    #----------- HERE (1) -------- #
                    encoded_img = encoder(img)
                    out_fr = net(encoded_img)
                    loss = F.mse_loss(out_fr, label_onehot)
                scaler.scale(loss).backward()
                scaler.step(optimizer)
                scaler.update()
            train_samples += label.numel()
            train_loss += loss.item() * label.numel()
            train_acc += (out_fr.argmax(1) == label).float().sum().item()
            functional.reset_net(net)
        train_loss /= train_samples
        train_acc /= train_samples
    return
And then I call:
writer = SummaryWriter(RUNS_PATH)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
encoder = encoding.PoissonEncoder()
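train_data_loader = data.DataLoader(
    dataset = train_dataset,
    batch_size = 128,
    shuffle = shuffle,
    drop_last = True)
# Keyword arguments are used throughout: a positional argument may not follow a keyword argument.
train_ls = only_train_multistep(net, epochs=10, train_data_loader=train_data_loader, device='cuda:0', writer=writer)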
I'm sorry, this is a bit verbose, but it should be helpful. The points I modified are HERE (1), where I removed the loop over T since the net is in multi-step mode, and HERE (2), where, following the CSNN tutorial, I added the T dimension to the tensor.
By adding some prints, I found that the error is raised during the call to scaler.scale(loss).backward(). However, I am not sure whether the forward part is implemented correctly.
/content/spikingjelly/spikingjelly/activation_based/auto_cuda/base.py in append(self, codes)
   1470         Append codes in self.codes.
   1471         """
-> 1472         codes = codes.replace('\n', '')
   1473         codes = codes.split(';')
   1474         for i in range(codes.__len__()):

AttributeError: 'NoneType' object has no attribute 'replace'
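On the question of whether the forward pass is shaped correctly, the tensor shapes can be checked without SpikingJelly or CUDA. The following is only a rough sketch in plain PyTorch: `if_multistep` is a hypothetical, hand-written stand-in for a multi-step IF neuron (threshold 1.0, hard reset to 0, forward only, no surrogate gradient), and `x_seq.flatten(2)` is assumed to play the role of `layer.Flatten()` applied at every time step. It is not the library's implementation.

```python
import torch
import torch.nn as nn

T, N = 5, 8  # time steps and batch size (illustrative values)

def if_multistep(x_seq: torch.Tensor, v_threshold: float = 1.0) -> torch.Tensor:
    """Hypothetical integrate-and-fire loop over the time dimension (forward only)."""
    v = torch.zeros_like(x_seq[0])       # membrane potential, shape [N, 10]
    spikes = []
    for t in range(x_seq.shape[0]):
        v = v + x_seq[t]                 # integrate the input current
        s = (v >= v_threshold).float()   # fire where the threshold is reached
        v = v * (1.0 - s)                # hard reset to 0 for neurons that fired
        spikes.append(s)
    return torch.stack(spikes)           # shape [T, N, 10]

fc = nn.Linear(28 * 28, 10, bias=False)

x = torch.rand(N, 1, 28, 28)                  # one batch of MNIST-sized inputs
x_seq = x.unsqueeze(0).repeat(T, 1, 1, 1, 1)  # [T, N, 1, 28, 28]
h_seq = fc(x_seq.flatten(2))                  # [T, N, 784] -> [T, N, 10]
out_fr = if_multistep(h_seq).mean(0)          # firing rate per class, [N, 10]

print(x_seq.shape, h_seq.shape, out_fr.shape)
```

If the shapes printed here match what the real network produces at each stage, the `unsqueeze`/`repeat`/`mean(0)` pattern in `forward` is at least dimensionally consistent with the `F.mse_loss(out_fr, label_onehot)` call in the training loop.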
Hi, I found the bug: it was caused by a missing return. It is fixed now.
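For readers who hit the same message, here is a small illustration of how a missing return produces exactly this kind of AttributeError. The function names below are hypothetical and only mimic the first lines of the failing frame in the traceback; this is not the actual SpikingJelly code or the actual fix.

```python
def build_kernel_codes_buggy() -> str:
    codes = "int i = 0; float v = 0.0f;"
    # Bug: the string is built but never returned, so the caller receives None.

def build_kernel_codes_fixed() -> str:
    codes = "int i = 0; float v = 0.0f;"
    return codes

def append(codes: str) -> list:
    """Hypothetical helper that mimics the failing lines from the traceback."""
    codes = codes.replace('\n', '')
    return [c for c in codes.split(';') if c.strip()]

try:
    append(build_kernel_codes_buggy())
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)                       # the NoneType complaint from the buggy path
print(append(build_kernel_codes_fixed()))  # the fixed path yields the two statements
```

The error surfaces in `append`, far from the function that forgot to return, which is why the traceback points at `base.py` rather than at the code that generated the `None`.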
A tutorial for training a fully connected SNN on MNIST is now available:
https://spikingjelly.readthedocs.io/zh_CN/latest/activation_based_en/lif_fc_mnist.html
Issue type
SpikingJelly version 'latest'
Description
Hi, I'm trying to implement multi-step mode on a single FC SNN. I'm currently following the old tutorial (0.0.0.0.12) but implementing it in the newest version. I can successfully train the net in single-step mode, but I get this error when switching to multi-step:
--- error ---
---> 19 train_ls = only_train_multistep(net, epochs, train_data_loader, device, writer)

9 frames
/content/spikingjelly/spikingjelly/activation_based/auto_cuda/base.py in append(self, codes)
   1470         Append codes in self.codes.
   1471         """
-> 1472         codes = codes.replace('\n', '')
   1473         codes = codes.split(';')
   1474         for i in range(codes.__len__()):

AttributeError: 'NoneType' object has no attribute 'replace'
--- error ---
Since there is no tutorial available for this version, I tried to follow the definitions of the train loop and the net from the CSNN tutorial as closely as possible (since it uses multi-step mode).
Thanks in advance
@fangwei123456
Minimal code to reproduce the error/bug
'''
My net:
class SNN(nn.Module):
    def __init__(self, T:int):
        super().__init__()

My train loop:
for epoch in range(epochs):
    print("Epoch {}:".format(epoch))
[.......]
'''