fangwei123456 / spikingjelly

SpikingJelly is an open-source deep learning framework for Spiking Neural Networks (SNNs) based on PyTorch.
https://spikingjelly.readthedocs.io

Q: About cuda out of memory/after functional reset #246

Closed mountains-high closed 2 years ago

mountains-high commented 2 years ago

Hello ~ While using a pre-trained model with SpikingJelly, I ran into a CUDA out-of-memory error. Could you guide me on how to solve this problem, please? Thank you very much.

I tried to solve it by resetting the network's neuron states after each optimization step:

optimizer.zero_grad()
loss.backward()
optimizer.step()
functional.reset_net(net)  # this line was added

SJ version: spikingjelly 0.0.0.0.4

Traceback (most recent call last):
  File "/home/user/Downloads/SJ_base_pruned_CIFARmodel/check_accuracy.py", line 266, in <module>
  File "/home/user/anaconda3/envs/sew/lib/python3.9/site-packages/torch-1.10.2-py3.9-linux-x86_64.egg/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/Downloads/SJ_base_pruned_CIFARmodel/models.py", line 64, in forward
    out_spikes_counter += self.boost(self.fc(self.conv(x)).unsqueeze(1)).squeeze(1)
  File "/home/user/anaconda3/envs/sew/lib/python3.9/site-packages/torch-1.10.2-py3.9-linux-x86_64.egg/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/anaconda3/envs/sew/lib/python3.9/site-packages/torch-1.10.2-py3.9-linux-x86_64.egg/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/home/user/anaconda3/envs/sew/lib/python3.9/site-packages/torch-1.10.2-py3.9-linux-x86_64.egg/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/anaconda3/envs/sew/lib/python3.9/site-packages/spikingjelly-0.0.0.0.4-py3.9.egg/spikingjelly/clock_driven/neuron.py", line 194, in forward
  File "/home/user/anaconda3/envs/sew/lib/python3.9/site-packages/spikingjelly-0.0.0.0.4-py3.9.egg/spikingjelly/clock_driven/neuron.py", line 131, in neuronal_reset
RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 1; 10.92 GiB total capacity; 9.50 GiB already allocated; 94.00 MiB free; 10.01 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Thank you
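The error message above also points at a possible workaround: when reserved memory is much larger than allocated memory, PyTorch's caching allocator may be fragmenting, and its max_split_size_mb option can help. A minimal sketch, assuming a bash-like shell (the value 128 is illustrative, not a tuned recommendation):

```shell
# Ask PyTorch's caching allocator to split large cached blocks,
# which can reduce fragmentation-related OOMs (value is illustrative).
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
python check_accuracy.py
```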

fangwei123456 commented 2 years ago

File "/home/user/Downloads/SJ_base_pruned_CIFARmodel/models.py", line 64, in forward out_spikes_counter += self.boost(self.fc(self.conv(x)).unsqueeze(1)).squeeze(1)

Try using a smaller batch size or a smaller T.
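To see why this helps: in a clock-driven SNN, activations (and, during training, the autograd graph) are kept for every simulation time step, so peak activation memory grows roughly linearly in both the batch size and T. A back-of-the-envelope sketch, where the per-sample activation count is a made-up figure for illustration, not taken from the model above:

```python
def activation_bytes(batch_size, T, features_per_sample, bytes_per_value=4):
    """Rough estimate of activation memory for one forward pass:
    each of the T time steps keeps its own copy of the activations."""
    return batch_size * T * features_per_sample * bytes_per_value

# Hypothetical model with ~1M activation values per sample per time step.
features = 1_000_000

full = activation_bytes(batch_size=64, T=16, features_per_sample=features)
reduced = activation_bytes(batch_size=64, T=8, features_per_sample=features)

print(f"T=16: {full / 2**30:.2f} GiB")     # T=16: 3.81 GiB
print(f"T=8:  {reduced / 2**30:.2f} GiB")  # T=8:  1.91 GiB
```

Halving either the batch size or T halves this estimate, which is why either change can get a run back under the GPU's memory limit.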

mountains-high commented 2 years ago

Thank you ~ It was solved by reducing the batch size and setting T=8.