KindXiaoming / pykan

Kolmogorov Arnold Networks
MIT License
14.56k stars 1.33k forks

Is it only suitable for small-scale models? #59

Closed HelloWorldLTY closed 2 months ago

HelloWorldLTY commented 4 months ago

Hi, thanks for your great work. I am thinking about implementing a KAN with 3072 input dimensions and 2000 output dimensions. Do you think the GPU is capable of running it? I have tried FourierKANs, but the process always got killed.

KindXiaoming commented 4 months ago

Hi, I found this linear projection trick from GraphKAN useful: https://github.com/WillHua127/GraphKAN-Graph-Kolmogorov-Arnold-Networks. In short, it's better to apply the KAN in a latent space: first use a linear layer to map the 3072-D input to a low-dimensional (latent) space, use a KAN to process the information there, and then use another linear layer to map back to 2000 dimensions. Also, just a bit curious, what is your dataset about? Where does this high dimension come from?
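
For a sense of scale, here is a rough back-of-envelope parameter count for the latent-space idea. It is a sketch, not pykan code. It assumes each KAN edge carries about `grid + k` spline coefficients and ignores per-edge scale factors. The latent width of 64, the values `grid=5` and `k=3`, and all function names are illustrative assumptions.

```python
def kan_layer_params(n_in, n_out, grid=5, k=3):
    """Approximate spline coefficients in one KAN layer (one spline per edge).

    Assumption: about (grid + k) coefficients per edge; extra per-edge
    scale factors are ignored.
    """
    return n_in * n_out * (grid + k)


def linear_params(n_in, n_out):
    """Weights plus biases of an ordinary dense layer."""
    return n_in * n_out + n_out


def direct_kan(n_in=3072, n_out=2000, grid=5, k=3):
    """A single KAN layer mapping the input straight to the output."""
    return kan_layer_params(n_in, n_out, grid, k)


def latent_kan(n_in=3072, n_out=2000, latent=64, grid=5, k=3):
    """Linear down-projection, KAN in the latent space, linear up-projection."""
    return (
        linear_params(n_in, latent)
        + kan_layer_params(latent, latent, grid, k)
        + linear_params(latent, n_out)
    )


if __name__ == "__main__":
    print("direct KAN :", direct_kan())
    print("latent KAN :", latent_kan())
```

Under these assumptions the direct 3072-to-2000 layer needs about 49 million coefficients, while the projected version needs about 0.36 million, most of them in the two linear layers.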

HelloWorldLTY commented 4 months ago

Thanks, I have tried einsum, but it failed. I will try your suggestions as well as suggestions from Fourier transformer.

I think combining KAN with MLP is a good idea and I am exploring it.

The high dimension comes from LLM embeddings. 3000+ is not very high; for a matrix with genes as features, the dimension will be approximately 20,000.

HelloWorldLTY commented 4 months ago

Also, reducing the grid size works for me.
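
A rough illustration of why a smaller grid helps: the sketch below estimates the memory taken by the spline coefficients of a single 3072-to-2000 KAN layer at several grid sizes. It assumes about `grid + k` float32 coefficients per edge. The function name is hypothetical, and the estimate leaves out activations, gradients, and optimizer state, which all add to the total.

```python
def kan_coeff_megabytes(n_in, n_out, grid, k=3, bytes_per_value=4):
    """Approximate size in MB of one KAN layer's spline coefficients.

    Assumption: (grid + k) coefficients per edge, stored as float32.
    """
    coeffs = n_in * n_out * (grid + k)
    return coeffs * bytes_per_value / 1e6


if __name__ == "__main__":
    for grid in (3, 5, 10, 20):
        mb = kan_coeff_megabytes(3072, 2000, grid)
        print(f"grid={grid:>2}: about {mb:.1f} MB of coefficients")
```

The coefficient count grows linearly with `grid + k`, so halving the grid size does not halve memory exactly, but it shrinks every edge's spline.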