Open kabachuha opened 1 month ago
Thanks for implementing this! This is an experiment I had in mind but never got a chance to do. The score field is kind of multiscale, which might be why KANs can outperform MLPs. It would also be fun to look at the approximation errors of KANs and MLPs in the near/intermediate/far fields. Do KANs outperform MLPs mostly in the near field (if what I said above makes sense)?
I'm not so well-versed in the math behind it myself — is it something like looking at the losses for each timestep?
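Something like that — a minimal sketch of the idea, assuming you already have per-sample denoising losses and the timestep each sample was noised at (the function name and the toy numbers below are hypothetical, not from the repo):

```python
import numpy as np

# Hypothetical sketch: bin per-sample denoising losses by the diffusion
# timestep they were computed at, to see where a model struggles
# (small t ~ near field, large t ~ far field).
def loss_by_timestep(losses, timesteps, n_steps, n_bins=10):
    """Average denoising loss per timestep bin."""
    edges = np.linspace(0, n_steps, n_bins + 1)
    bin_ids = np.clip(np.digitize(timesteps, edges) - 1, 0, n_bins - 1)
    return np.array([losses[bin_ids == b].mean() for b in range(n_bins)])

# Toy usage with synthetic numbers (not real model outputs):
rng = np.random.default_rng(0)
t = rng.integers(0, 1000, size=5000)
l = 0.001 * t + rng.normal(0, 0.05, size=5000) ** 2  # pretend loss grows with t
per_bin = loss_by_timestep(l, t, n_steps=1000)
print(per_bin)
```

Comparing these per-bin curves between a KAN and an MLP trained on the same data would be one way to check the near/far-field hypothesis.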
Anyway, I think it would be a great community experiment
This does not generalise broadly. I tried it with an image diffusion model, replacing the MLPs with equivalently layered KANs, and the generalisation is poor. I have elaborated with a more pathological example in function approximation. MLPs still outperform KANs there. Any insights?
> MLPs are still outperforming KANs.
Wow, a single example proves that an entire branch of techniques and methods is useless.
No, I think it just shows we still don't have enough insight to define what an optimal structure for a given target looks like — which is hopefully the goal here. Correct me if I'm wrong.
I think generative modeling is particularly subtle because these two goals are not necessarily aligned: (1) fitting the score function well; (2) generalizing well. My intuition is that KANs are good at (1) but not necessarily (2), which is the true goal of generative modeling. This experiment seems like a good starting point, and I definitely agree this is a community effort. Again, great initiative!
> I'm not so well-versed in the math behind it myself — is it something like looking at the losses for each timestep?
@kabachuha were you able to figure this out?
Hi! I tested how KANs can assist denoising diffusion by adapting the original toy spiral-diffusion model from an MLP to a KAN.
You can see that a 2-layer KAN fares almost as well as a 4-layer MLP (despite having 30% fewer parameters), while the 4-layer KAN vastly outperforms it.
I think it would be worth adding a diffusion notebook to this repository, at least for educational purposes and for gaining more mainstream attention.
Additionally, it would be nice to explore whether the layers learned any interpretable functions.
https://github.com/kabachuha/kan-diffusion
(Plots: loss curves for the same-structure KAN, the MLP, and the 2-layer KAN.)
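For anyone wanting to reproduce the MLP→KAN swap without pulling in the full pykan dependency, here is a hypothetical, simplified sketch. The `SimpleKANLayer` below parameterizes each edge's learnable univariate function with a Gaussian RBF basis rather than the B-splines used in the reference KAN implementation, so it only illustrates the structural idea, not the exact method from the linked repo:

```python
import torch
import torch.nn as nn

# Hypothetical sketch: a simplified KAN-style layer. Each (input, output)
# edge applies its own learnable univariate function, parameterized here
# by fixed Gaussian RBF centers with learnable coefficients.
class SimpleKANLayer(nn.Module):
    def __init__(self, in_dim, out_dim, n_basis=8):
        super().__init__()
        self.centers = nn.Parameter(torch.linspace(-2, 2, n_basis),
                                    requires_grad=False)
        # one coefficient vector per (input, output) edge
        self.coef = nn.Parameter(torch.randn(in_dim, out_dim, n_basis) * 0.1)

    def forward(self, x):  # x: (batch, in_dim)
        # phi: (batch, in_dim, n_basis) RBF features of each scalar input
        phi = torch.exp(-(x.unsqueeze(-1) - self.centers) ** 2)
        # sum the learnable univariate functions over incoming edges
        return torch.einsum('bip,iop->bo', phi, self.coef)

# Drop-in replacement for the MLP score net in a toy 2D diffusion model:
# input is (x, y, t), output is the predicted 2D noise.
score_net = nn.Sequential(SimpleKANLayer(3, 16), SimpleKANLayer(16, 2))
out = score_net(torch.randn(4, 3))
print(out.shape)  # (4, 2)
```

Training it with the usual denoising objective and comparing loss curves against an MLP of matched parameter count would mirror the spiral experiment above.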