Open zjcurtis opened 2 months ago
> for a given amount of compute.

What exactly do you mean by this? That KANs are slower in training/inference?
From the paper - "Currently, the biggest bottleneck of KANs lies in its slow training. KANs are usually 10x slower than MLPs, given the same number of parameters. We should be honest that we did not try hard to optimize KANs' efficiency though, so we deem KANs' slow training more as an engineering problem to be improved in the future rather than a fundamental limitation."
My interest is in discussing ways training can be made more efficient, and specifically whether there are function approximations that could train faster.
I'm actually investigating exactly this (and more) in #99, so I agree with you that it's something worth tackling.
The current issue with KANs seems to be that they are less performant in training for a given amount of compute. Perhaps there are alternative ways to approximate the functions of one variable, even if only for initialization?
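To make the idea concrete, here is a minimal sketch (all names illustrative, nothing from the pykan codebase) of one such alternative: a KAN-style univariate edge function built on a Chebyshev polynomial basis instead of B-splines. Chebyshev bases are evaluated with a simple recurrence and need no knot search, which is one candidate direction for faster training, and a least-squares fit against a target function could serve as a cheap initialization.

```python
import numpy as np

def chebyshev_basis(x, degree):
    """Return T_0..T_degree evaluated at x (x assumed in [-1, 1])."""
    T = [np.ones_like(x), x]
    for _ in range(2, degree + 1):
        T.append(2 * x * T[-1] - T[-2])  # recurrence: T_n = 2x*T_{n-1} - T_{n-2}
    return np.stack(T[: degree + 1], axis=-1)

def edge_function(x, coeffs):
    """phi(x) = sum_k c_k * T_k(x): one learnable univariate function."""
    return chebyshev_basis(x, len(coeffs) - 1) @ coeffs

# Example: fit sin(pi * x) on [-1, 1] by least squares, mimicking a
# cheap initialization for a single edge function.
x = np.linspace(-1, 1, 200)
target = np.sin(np.pi * x)
B = chebyshev_basis(x, 7)
coeffs, *_ = np.linalg.lstsq(B, target, rcond=None)
err = np.max(np.abs(edge_function(x, coeffs) - target))
```

Whether this actually trains faster than splines end-to-end is exactly the open question, but evaluation cost per point here is a short recurrence plus one dot product, which is easy to batch on GPU.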
I don't have concrete proposals, but it seemed like a worthwhile discussion to have!