Open jagdap opened 5 months ago
Digging a little deeper, I think there are two issues at play. First, SpikingNeuron.instances is being populated but never cleared, so the class variable grows each time I create a new network with a SpikingNeuron child, and memory usage grows with it. This is especially painful in Jupyter Notebook, since I tend to re-run cells that create models.
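To illustrate the first point, here is a minimal plain-Python sketch of how a class-level registry keeps objects alive. `TrackedStrong` and `TrackedWeak` are hypothetical stand-ins, not snnTorch code, and the weak-reference variant is only one possible way such a registry could avoid retaining instances:

```python
import gc
import weakref


class TrackedStrong:
    """Hypothetical class that records every instance in a class-level list."""

    instances = []

    def __init__(self):
        # The list holds a strong reference, so the object outlives `del`.
        TrackedStrong.instances.append(self)


class TrackedWeak:
    """Same idea, but the registry only holds weak references."""

    instances = weakref.WeakSet()

    def __init__(self):
        TrackedWeak.instances.add(self)


def count_after_rebuilds(cls, n=3):
    """Create and drop n objects, like re-running a notebook cell n times."""
    for _ in range(n):
        obj = cls()
        del obj
    gc.collect()
    return len(cls.instances)


strong_count = count_after_rebuilds(TrackedStrong)
weak_count = count_after_rebuilds(TrackedWeak)
print("strong registry still holds:", strong_count)
print("weak registry still holds:", weak_count)
```

With the strong list, every object ever created stays reachable until the list is cleared by hand; with the `WeakSet`, entries disappear once the last outside reference is gone.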
The other issue might be a PyTorch issue, but I'm not sure yet. One of the things Leaky does is "select" which function will be used for the reset mechanism. This is assigned via self.state_function = self._base_sub (or whichever is selected). Assigning the method to an instance attribute appears to create some sort of memory leak. I've recreated the problem below:
import torch
import torch.nn as nn

class myModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.register_buffer("test", torch.as_tensor(1.0))
        # Store a bound method on the instance, as Leaky does with state_function
        self.which_foo = self.foo

    def forward(self, x):
        return x

    def foo(self, x):
        return

for _ in range(3):
    m = myModule().cuda()
    del m
    torch.cuda.empty_cache()
    print("\n")
    print("mem alloc:", torch.cuda.memory_allocated())
    print("max mem alloc:", torch.cuda.max_memory_allocated())
Output:
mem alloc: 512
max mem alloc: 512
mem alloc: 1024
max mem alloc: 1024
mem alloc: 1536
max mem alloc: 1536
Of note, the error only occurs when I've registered a buffer. Strangely, if I comment out the self.register_buffer line, there's no longer an issue.
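One possible explanation, sketched here in plain Python without torch or a GPU: storing a bound method on the instance creates a reference cycle (instance -> attribute -> bound method -> instance), so in CPython `del` alone does not free the object, and it lingers until the cyclic garbage collector runs. The class names below are hypothetical stand-ins for the module above, and whether this fully accounts for the CUDA numbers is an assumption that has not been checked here.

```python
import gc
import weakref


class WithBoundMethod:
    """Hypothetical stand-in for myModule: keeps a bound method on self."""

    def __init__(self):
        self.which_foo = self.foo  # bound method references self -> cycle

    def foo(self, x):
        return x


class WithoutBoundMethod:
    """Same class without the self-referencing attribute."""

    def foo(self, x):
        return x


def probe(cls):
    """Create an object, delete the only name for it, report if it survives."""
    obj = cls()
    ref = weakref.ref(obj)
    del obj
    return ref() is not None, ref


gc.disable()  # keep the cyclic collector from running on its own mid-demo
try:
    cyclic_alive_after_del, cyclic_ref = probe(WithBoundMethod)
    plain_alive_after_del, _ = probe(WithoutBoundMethod)
    gc.collect()  # explicitly collect reference cycles
    cyclic_alive_after_gc = cyclic_ref() is not None
finally:
    gc.enable()

print("cyclic object alive after del:", cyclic_alive_after_del)
print("plain object alive after del:", plain_alive_after_del)
print("cyclic object alive after gc.collect():", cyclic_alive_after_gc)
```

If the same mechanism is at work in the reproduction, calling `gc.collect()` before `torch.cuda.empty_cache()` in the loop would be expected to release the memory, though that is untested here. It would also be consistent with the buffer observation: with no buffer, the module owns no CUDA tensor, so there is nothing for `torch.cuda.memory_allocated()` to count.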
Description
GPU memory usage grows unexpectedly when using the Leaky() neuron.
What I Did
After noticing unexpected memory growth with my SNN, I tracked down a memory leak/garbage collection issue with the Leaky neuron, demonstrated with the code below:
Output:
Notice that the memory usage on the GPU is growing even when explicitly deleting the neuron.
Expected Behavior
We would expect that dereferencing or using del would remove the Leaky neuron from all memory. This is the behavior observed with the torch.nn.Linear layer, for example:
Output:
I suspect part of the issue is how instances are handled and tracked by the snntorch.SpikingNeuron class, but the behavior is observed even after deleting all instances.