chenziwenhaoshuai / Vision-KAN

KAN for Vision Transformer
MIT License
213 stars · 13 forks

CUDA out of memory. #1

Open ybu-lxd opened 4 months ago

ybu-lxd commented 4 months ago

Hello, I found that when the number of tokens reaches a certain size, the following situation will occur.

CUDA out of memory. Tried to allocate 1.12 GiB. GPU 0 has a total capacty of 23.65 GiB of which 730.19 MiB is free.
chenziwenhaoshuai commented 4 months ago

pls show me more details

ybu-lxd commented 4 months ago

> pls show me more details

When I use FastKAN, the memory usage is extremely high; the CUDA out of memory error occurs once the number of tokens reaches 4096.

```python
import torch
# NaiveFourierKANLayer as defined in the FourierKAN code used here
# (assumed to be importable in this environment)
fkan1 = NaiveFourierKANLayer(768, 768, 300).to("cuda")
x = torch.rand(size=(4, 49, 768)).to("cuda")
print(fkan1(x).shape)  # torch.Size([4, 49, 768])
```
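The blow-up is easy to see with back-of-the-envelope arithmetic, assuming (as the naive implementation does) that the layer materializes the cos(k·x) and sin(k·x) terms for every (batch, token, input-dim, grid-frequency) combination in float32. This is a sketch, not a measurement of the actual allocator behavior:

```python
# Estimate the activation memory of a naive Fourier KAN layer, assuming it
# materializes cos(k*x) and sin(k*x) tensors of shape
# (batch, tokens, in_dim, gridsize) in float32.

def fourier_kan_activation_bytes(batch, tokens, in_dim, gridsize, bytes_per_el=4):
    # factor of 2 for the cos and sin tables
    return 2 * batch * tokens * in_dim * gridsize * bytes_per_el

# The snippet above (49 tokens) fits easily:
small = fourier_kan_activation_bytes(4, 49, 768, 300)
print(f"49 tokens:   {small / 2**30:.2f} GiB")

# ...but at 4096 tokens the same layer needs roughly 28 GiB for these
# terms alone, more than the 24 GiB card in the report:
large = fourier_kan_activation_bytes(4, 4096, 768, 300)
print(f"4096 tokens: {large / 2**30:.2f} GiB")
```

So the OOM at 4096 tokens is expected: the intermediate grows linearly in tokens, input width, and grid size, and the grid size of 300 multiplies everything.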
chenziwenhaoshuai commented 4 months ago

Yes, that's normal; that's why I can only set the hidden layer very small on my local machine, e.g. 192. I'm sure more work will be done in the near future to reduce the GPU memory usage, but not yet!
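One engineering workaround until the layer itself is optimized: since the layer acts independently on each token, the token dimension can be processed in chunks so the large cos/sin intermediate is never materialized for all tokens at once. This is a minimal sketch (the `chunked_forward` helper is hypothetical, not part of Vision-KAN); it caps peak memory at inference, or during training when combined with gradient checkpointing:

```python
import torch

def chunked_forward(layer, x, chunk_size=512):
    """Apply a per-token `layer` to a (batch, tokens, dim) input in chunks.

    Peak activation memory scales with chunk_size instead of the full
    token count; the result is identical because the layer does not mix
    information across tokens.
    """
    outs = [layer(x[:, i:i + chunk_size]) for i in range(0, x.size(1), chunk_size)]
    return torch.cat(outs, dim=1)

# Usage: wrap the OOM-prone call from the snippet above, e.g.
#   y = chunked_forward(fkan1, x, chunk_size=256)
```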

ybu-lxd commented 4 months ago

Yes, the number of tokens affects the network's performance. I think KAN does a good job, but it is currently limited by engineering.

ybu-lxd commented 4 months ago

I'm trying to use KAN for image generation, but the results have been poor due to the limit on the number of tokens.

chenziwenhaoshuai commented 4 months ago

Yes, I'm not getting good classification accuracy at the moment either; I need to experiment more.