KindXiaoming / pykan

Kolmogorov Arnold Networks
MIT License

The Position of Model Initialization Code Matters? #381

Closed Rashfu closed 3 weeks ago

Rashfu commented 1 month ago

I used the simple example from the tutorial. When I simply moved the line `model = KAN(width=[2,5,1], grid=5, k=3, seed=1)`, I got better fitting results. Should we always define the dataset first and then initialize the model? I'm not sure what is happening.
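One plausible explanation for why the position of the constructor matters: if the dataset generator and the model initializer both draw from PyTorch's global RNG, the order of the calls determines which random numbers each one consumes. A minimal sketch in plain PyTorch (stand-in functions, not pykan's actual internals):

```python
import torch

def init_weights():
    # stand-in for model initialization drawing from the global RNG
    return torch.randn(5)

def make_dataset():
    # stand-in for dataset sampling, which also consumes the global RNG
    return torch.rand(100, 2)

# Order A: seed -> dataset -> model
torch.manual_seed(1)
_ = make_dataset()
w_after = init_weights()

# Order B: seed -> model -> dataset
torch.manual_seed(1)
w_before = init_weights()
_ = make_dataset()

# The two initializations differ, because in order A the dataset call
# advanced the RNG before the model drew its weights.
print(torch.allclose(w_after, w_before))  # -> False
```

So moving the `model = KAN(...)` line changes the effective initialization even with the same seed, which can push training toward a different local minimum.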

KindXiaoming commented 1 month ago

I suspect seeding plays a subtle role here. Please try other seeds and see whether this persists.

Rashfu commented 1 month ago

Thanks for your prompt reply!

I tried using different seeds: some results were normal, while others produced redundant functions. Could you explain which parts of the algorithm are affected by the seed, leading to unstable optimization results? In my experience training neural networks, the seed should only affect the training process, not the network's final fitting performance. Can I assume that KAN is currently unable to consistently optimize to a good local optimum?

KindXiaoming commented 1 month ago

A few possibilities: (1) small networks (whether KANs, MLPs, or anything else) have more bad local minima than overparameterized networks; (2) LBFGS may get stuck at local minima; (3) I don't know of anything specific to KAN that would make it inconsistent.
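Point (2) is easy to reproduce even without KANs: on a fixed toy regression task, LBFGS started from different seeds can settle at noticeably different final losses. A small sketch in plain PyTorch (a tiny MLP as a stand-in, since only the optimizer behavior matters here):

```python
import torch
import torch.nn as nn

# Fixed toy regression target (seeded once, so the data is identical
# across all runs below)
torch.manual_seed(0)
X = torch.rand(200, 2) * 2 - 1
y = torch.sin(3 * X[:, :1]) * torch.cos(3 * X[:, 1:])

def train(seed):
    # the seed only affects the weight initialization of this tiny net
    torch.manual_seed(seed)
    net = nn.Sequential(nn.Linear(2, 5), nn.Tanh(), nn.Linear(5, 1))
    opt = torch.optim.LBFGS(net.parameters(), max_iter=200)

    def closure():
        opt.zero_grad()
        loss = ((net(X) - y) ** 2).mean()
        loss.backward()
        return loss

    opt.step(closure)  # LBFGS runs its full inner loop inside one step()
    return ((net(X) - y) ** 2).mean().item()

losses = [train(s) for s in range(5)]
print(losses)  # the final losses typically spread over a range: some
               # seeds land in noticeably worse local minima than others
```

The spread in final losses across seeds is the signature of a small model plus a full-batch second-order optimizer converging to whichever basin the initialization falls into.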