KindXiaoming / pykan

Kolmogorov-Arnold Networks
MIT License
15.07k stars 1.39k forks

KANvolver leads to 99.5% accuracy on MNIST dataset #180

Closed 1ssb closed 4 months ago

1ssb commented 6 months ago

Please check for reproduction and methods to further optimise: https://github.com/1ssb/torchkan

AthanasiosDelis commented 6 months ago

Congratulations, @1ssb! Epoch 13, 99.44% validation accuracy with my edition of train_mnist.py running MNIST. KANvolver is for real.

Two questions:

1) How are you going to proceed?

2) KANvolver, FastKAN, FasterKAN, and the other variations: at what point do you believe the variants have mutated so much from the original pyKAN that they can no longer be considered KANs? Do KANvolver, FastKAN, or FasterKAN still subscribe to the KA theorem (Kolmogorov-Arnold representation theorem) as rephrased in the pyKAN paper?

@KindXiaoming @Blealtan @ZiyaoLi your opinion would also be highly appreciated!
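For readers unfamiliar with the abbreviation: the "KA theorem" referenced above is the Kolmogorov-Arnold representation theorem, which states that any continuous function $f: [0,1]^n \to \mathbb{R}$ can be written as a finite sum of compositions of univariate functions:

```latex
f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```

The pyKAN paper generalizes this two-layer structure to arbitrary depth and width; the question above is essentially whether the convolutional and radial-basis variants still fit that generalized form.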

1ssb commented 6 months ago

We must come up with a mathematical intuition as we move forward. They all still follow general additive model structures, so yes. But I firmly believe that if we do not sufficiently understand the process, we will be back to square one, just as with MLPs.

Best,
Subhransu



yuedajiong commented 6 months ago

Could you please express your experimental conclusions and future analysis in simpler, plainer terms?

My understanding is here; you can translate it with ChatGPT, or if needed I can translate it into English: https://github.com/KindXiaoming/pykan/issues/86

In short: MLP: mul(as_whole); KAN: add(independent_inputs)

Part of it, translated by ChatGPT:
"Additive representations predominantly accumulate quantities of similar objects in terms of quantity, while multiplicative representations primarily represent the combined transformations between different physical quantities."

"If dealing with a highly abstract space, where the black box nature of things and numerous hard-to-understand variables are not a concern, I personally lean towards using MLP (multi-layer perceptron) as a multiplicative building block for fitting. Conversely, in shallower layers or layers biased towards output, where I know the space between inputs and outputs can be obtained through a finite number of transformations and I want to 'force' the relationship with inputs, then KAN (kernel additive network) can be used. However, if the relationship with inputs is not directly strong, applying various regularizations to MLP makes it easier to 'discover' some 'combined/composite' features."

1ssb commented 6 months ago

If you are asking for my experimental conclusions, let me be very clear: I do not have any yet. I have barely started the analysis, and the only way to provide one is to work extensively on the comparisons and visualizations.

yuedajiong commented 6 months ago

@1ssb Thanks! You're doing great, I'll keep an eye on your work.