KindXiaoming / pykan

Kolmogorov Arnold Networks
MIT License
13.71k stars 1.21k forks

Can you give me some tips about the "2n + 1" in Eq. 2.1 #60

Open nasyxx opened 2 months ago

nasyxx commented 2 months ago

Hi,

I recently came across your fascinating paper and thoroughly enjoyed reading it. The insights you presented were truly thought-provoking.

However, I am hoping to gain some clarity. In Equation 2.1, the upper limit of the sum over q (equivalently, the layer width) is set to 2n + 1. I tried to find more information about this in the references you provided, but unfortunately, I couldn't locate any specific details.
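For reference, Eq. 2.1 of the paper (the Kolmogorov-Arnold representation) has the following form, as best it can be transcribed here. The symbols $\Phi_q$ and $\phi_{q,p}$ follow the paper's notation for the outer and inner univariate functions; the limit in question is the 2n + 1 on the outer sum:

$$
f(x_1, \dots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
$$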

I was wondering if you could kindly provide me with some tips or guidance on understanding the reasoning behind this particular choice of the upper limit. Any additional context or resources you could share would be greatly appreciated.

Thank you in advance for your time and consideration.

woshitff commented 2 months ago

https://www.bilibili.com/video/BV1NH4y1G7dB/?spm_id_from=333.788&vd_source=9f0fca70d4c6dcb4911c1f21543f484a It may help you.

nasyxx commented 2 months ago

I just checked. He didn't mention why it is 2n+1.

kolmogorov-quyet commented 2 months ago

To understand why, imagine you're trying to approximate a function f defined on a 2-dimensional square. You can think of f as a surface above the square. To approximate this surface, you might use a combination of simple functions, like polynomials or trigonometric functions, to create a "patchwork" of surfaces that approximate the original function.

In this case, you might need 5 terms (2^2 + 1 = 5) to get a good approximation: one term for each of the four quadrants of the square, plus one extra term to account for the "glue" that holds the quadrants together.

As you move to higher-dimensional spaces, the number of terms needed to approximate a function grows exponentially. This is because the number of "quadrants" or "patches" needed to cover the space grows exponentially with the dimensionality.

In the case of an n-dimensional cube, this picture calls for 2^n "quadrants" or "patches" to cover the space, plus one extra term to account for the "glue" that holds everything together, i.e. 2^n + 1 terms. Note that this is only an intuition and does not match the theorem exactly: Eq. 2.1 sums over 2n + 1 outer terms, which grows linearly in n, and the two counts agree only for n = 1 and n = 2.

I hope this helps with the problem. In any case, the paper does not spell out much of the underlying theory.
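To make the two counts discussed in this thread concrete, here is a small Python sketch that tabulates the 2n + 1 from Eq. 2.1 next to the 2^n + 1 from the patch-counting picture. The function names are hypothetical and only for illustration; nothing here comes from pykan.

```python
def outer_terms_eq_2_1(n: int) -> int:
    """Upper limit of the outer sum in Eq. 2.1 (linear in n)."""
    return 2 * n + 1


def patch_count(n: int) -> int:
    """Count from the 'quadrants plus glue' picture (exponential in n)."""
    return 2 ** n + 1


# Tabulate both counts for a few input dimensions n.
rows = [(n, outer_terms_eq_2_1(n), patch_count(n)) for n in range(1, 7)]

for n, linear, exponential in rows:
    print(f"n={n}: 2n+1={linear}, 2^n+1={exponential}")
```

The two expressions coincide for n = 1 and n = 2 and diverge from n = 3 onward, so a 2-dimensional example cannot tell them apart.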