Closed Aniosss closed 1 month ago
Some benchmarks that might be helpful: https://github.com/GistNoesis/FusedFourierKAN/issues/4
The activation is somewhat slower (more so if your VRAM bandwidth is limited), and the parameter count is by default 5x that of nn.Linear with the same input and output sizes, so the FLOPs scale accordingly. I haven't run rigorous benchmarks myself.
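As a back-of-envelope check, FLOPs for a dense layer grow with its parameter count, so a 5x parameter overhead suggests roughly 5x the compute per forward pass. A minimal sketch — the batch size, layer widths, and the strict 5x proportionality are all illustrative assumptions, not measurements:

```python
def matmul_flops(batch, d_in, d_out):
    # Multiply-accumulate operations for one dense forward pass:
    # each of the batch * d_out outputs needs d_in multiplies
    # and d_in additions.
    return 2 * batch * d_in * d_out

# Illustrative numbers; param_ratio is the 5x default parameter
# overhead mentioned above, treated here as a rough FLOPs ratio.
param_ratio = 5
flops_linear = matmul_flops(32, 512, 512)
flops_kan_estimate = param_ratio * flops_linear

print(flops_linear, flops_kan_estimate)
```

This ignores the cost of evaluating the basis functions themselves, which is why the real slowdown can exceed the raw FLOPs ratio when memory bandwidth is the bottleneck.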
Also, AFAIK ChebyKAN takes the same approach as my implementation, which is also the second best in the benchmark mentioned above by @Jerry-Master. The results seem reasonable to me.
Hey, I want to use your implementation. Do you know how much slower training would be compared to nn.Linear?