ZiyaoLi / fast-kan

FastKAN: Very Fast Implementation of Kolmogorov-Arnold Networks (KAN)
Apache License 2.0

Adding L2 Penalty to mimic GAM P-Spline? #10

Open · thipokKub opened this issue 3 weeks ago

thipokKub commented 3 weeks ago

See this post

KAN uses B-splines as the underlying activation, but the default GAM uses P-splines. Could the performance difference come from the L2 penalty on the coefficient weights?
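For concreteness, here is a minimal sketch (not part of this repo) of the two penalties being discussed: a plain ridge / L2 penalty on the coefficient weights, and the classic P-spline roughness penalty, which penalizes squared differences of adjacent basis coefficients. The `spline_linear` name filter is only an assumption about how the coefficient layers might be named; adjust it to the model actually being trained.

```python
import torch
import torch.nn as nn


def coeff_l2_penalty(model: nn.Module, name_filter: str = "spline_linear") -> torch.Tensor:
    # Ridge (L2) penalty: sum of squared entries of every parameter whose
    # name contains `name_filter` (assumed to select the spline/RBF
    # coefficient weights; the filter string is a placeholder).
    sq_sums = [p.pow(2).sum() for n, p in model.named_parameters()
               if name_filter in n and p.requires_grad]
    return torch.stack(sq_sums).sum() if sq_sums else torch.zeros(())


def pspline_diff_penalty(coeffs: torch.Tensor, order: int = 2) -> torch.Tensor:
    # Classic P-spline roughness penalty (Eilers & Marx): squared finite
    # differences of adjacent basis coefficients along the last dimension.
    d = coeffs
    for _ in range(order):
        d = d[..., 1:] - d[..., :-1]
    return d.pow(2).sum()
```

Either term would simply be scaled by a small penalty weight and added to the training loss.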

ZiyaoLi commented 3 weeks ago

Just personally, for me none of the implementations with different splines / RBFs essentially changes the performance (accuracy). Arguing that some of them are better than others is somewhat like swapping activations in MLPs: are state-of-the-art SiLUs really better than ReLUs? Who knows.

BTW, the robustness / generalizability of KANs versus MLPs or other baselines has never really been studied. Your suggestion of an L2 penalty on the coefficients seems reasonable.
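If anyone wants to try the hypothesis empirically, a self-contained toy training loop might look like the sketch below; the small MLP stand-in, the random data, and the penalty weight `lam` are all placeholders, and a FastKAN model from this repo could be substituted (with the parameter-name filter adjusted to its coefficient layers).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in model and data; a FastKAN model could be dropped in here instead.
model = nn.Sequential(nn.Linear(8, 32), nn.SiLU(), nn.Linear(32, 1))
x, y = torch.randn(64, 8), torch.randn(64, 1)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()
lam = 1e-4  # penalty strength; placeholder value to be tuned

for step in range(200):
    # L2 penalty on the weight matrices; with a FastKAN model one would
    # restrict the filter to the spline/RBF coefficient layers instead.
    penalty = sum(p.pow(2).sum()
                  for n, p in model.named_parameters() if n.endswith("weight"))
    loss = criterion(model(x), y) + lam * penalty
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```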