Closed: AlexWang1900 closed this issue 2 months ago
So this part still hasn't been optimized? It consumes a huge amount of GPU memory.
How much RAM did nano-gpt use before the change? Have you tried the smallest nano-gpt setting?
I also encountered this problem. Have you solved it?
Same problem here. My input is a tensor of shape 8000x36, and my KAN network is:
kan_net = KAN(width=[36, 32, 10], grid=5, k=3, seed=0, device=torch.device('cuda'))
In the end, the network consumes a huge amount of memory.
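A rough back-of-the-envelope sketch of why this blows up (an estimate, not pykan's exact allocator behaviour): the original KANLayer evaluates one spline per (input, output) edge, so an intermediate B-spline basis tensor is on the order of batch × in_dim × out_dim × (grid + k) floats, and the recursive B-spline construction holds several such tensors at once.

```python
def kan_activation_bytes(batch, in_dim, out_dim, grid=5, k=3, dtype_bytes=4):
    """Rough size of ONE intermediate B-spline tensor in a KAN layer.

    Assumption (hypothetical model of the implementation): each of the
    in_dim * out_dim edges carries its own spline with (grid + k) basis
    functions, evaluated for the whole batch at once.
    """
    return batch * in_dim * out_dim * (grid + k) * dtype_bytes

# The 8000x36 input through the first layer of the [36, 32, 10] network above:
print(kan_activation_bytes(8000, 36, 32))  # ~2.9e8 bytes, i.e. roughly 280 MiB
```

That is only one intermediate for one layer; with the recursion in `B_batch`, gradients, and a transformer-sized width (as in the nanoGPT experiment below), the multi-GiB allocations in the traceback are plausible.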
@ybu-lxd @STQ-AmadeusUser How much RAM did nano-gpt use before the change? Have you tried the smallest nano-gpt setting?
You can have a look at other transformer + KAN combinations, such as a-r-r-o-w/kanformer and kanformers.
I replaced the MLP with a KAN layer, defining `class MLP_KAN(nn.Module):` and using it in place of the original `class MLP(nn.Module):`,
and it ran out of memory:

```
File "/home/alex/Projects/pykan-master/kan/spline.py", line 60, in B_batch
    value = (x - grid[:, :-(k + 1)]) / (grid[:, k:-1] - grid[:, :-(k + 1)]) * B_km1[:, :-1] \
          + (grid[:, k + 1:] - x) / (grid[:, k + 1:] - grid[:, 1:(-k)]) * B_km1[:, 1:]
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 22.50 GiB. GPU 0 has a total capacty of 23.64 GiB of which 13.85 GiB is free. Including non-PyTorch memory, this process has 9.05 GiB memory in use. Of the allocated memory 8.58 GiB is allocated by PyTorch, and 25.99 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
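One hypothetical workaround (a sketch, not part of pykan's API): since the allocation in `B_batch` scales with the batch dimension, you can cap the peak by feeding the model slices of the batch and concatenating the outputs. This is mainly useful for inference or gradient accumulation; `forward_in_chunks` and `chunk_size` are names I made up for illustration.

```python
import torch

def forward_in_chunks(model, x, chunk_size=1024):
    # Run the model on batch slices so the intermediate B-spline tensors
    # inside each forward pass stay small, then stitch the outputs back
    # together along the batch dimension.
    outs = []
    for start in range(0, x.shape[0], chunk_size):
        outs.append(model(x[start:start + chunk_size]))
    return torch.cat(outs, dim=0)
```

With `chunk_size=1024`, the 8000x36 input above would be processed in 8 forward passes, shrinking each intermediate allocation by roughly 8x at the cost of more kernel launches.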