SynodicMonth / ChebyKAN

Kolmogorov-Arnold Networks (KAN) using Chebyshev polynomials instead of B-splines.

ChebyKANs do not support continual learning #8

Open iiisak opened 5 months ago

iiisak commented 5 months ago

It seems like ChebyKANs do not support continual learning. This is probably because B-splines are inherently local basis functions, while a Chebyshev expansion is global. https://colab.research.google.com/drive/1lSYvnOmfoRnmrvBx-zkUTpavkjmfpNT1
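To make the local-vs-global distinction concrete, here is a minimal sketch (illustrative only, not code from this repo) contrasting a Chebyshev basis function, which is nonzero over essentially the whole domain, with a degree-0 B-spline basis function, which is nonzero only on a single knot interval:

```python
import torch

x = torch.linspace(-1, 1, 9)

# Chebyshev basis T_3(x) = cos(3 * acos(x)): nonzero almost everywhere,
# so updating its coefficient changes the output over the entire domain.
T3 = torch.cos(3 * torch.acos(x))
print(T3)  # no region where it is identically zero

# A degree-0 B-spline (indicator of one knot interval) is local:
# it is exactly zero outside [0.0, 0.25], so its coefficient only
# affects inputs that fall inside that interval.
B = ((x >= 0.0) & (x < 0.25)).float()
print(B)
```

This is why fitting a new region of the input distribution perturbs previously learned regions for a Chebyshev expansion but not for a B-spline grid.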

SynodicMonth commented 5 months ago

It truly is! Another reason why the original KAN uses B-splines! (Maybe we can make Chebyshev local as well. I'm considering whether it's possible.)

iiisak commented 5 months ago

> Maybe we can make Chebyshev local as well. I'm considering whether it's possible.

@SynodicMonth What about a Chebyshev polynomial grid? Continual learning in the original KAN paper comes from its grid. If we define T_n(x) as cos(n * acos(x)) for |x| <= 1 and 0 otherwise, we can build a grid by summing shifted copies of this truncated basis (maybe this can be vectorized too; see the sketch below).

P.S.: Grids would also remove the need for `tanh`, as the domain of a Chebyshev polynomial grid is no longer limited to [-1, 1].
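A minimal sketch of what such a grid could look like, assuming a 1-D input and shifted copies of the truncated basis; the names `cheby_grid`, `grid_size`, and `degree` are illustrative and not from this repo:

```python
import torch

def cheby_grid(x, grid_size=5, degree=4, x_min=-2.0, x_max=2.0):
    # Grid centers; each center carries a truncated Chebyshev basis
    # supported on [center - half_width, center + half_width].
    centers = torch.linspace(x_min, x_max, grid_size)
    half_width = (x_max - x_min) / (grid_size - 1)

    # Map x into the local coordinate of every cell: shape (..., grid_size)
    u = (x.unsqueeze(-1) - centers) / half_width

    # Truncated basis: T_n(u) = cos(n * acos(u)) for |u| <= 1, else 0,
    # so each cell's coefficients only influence nearby inputs.
    n = torch.arange(degree + 1)
    inside = (u.abs() <= 1).unsqueeze(-1)
    u_clamped = u.clamp(-1, 1).unsqueeze(-1)
    basis = torch.where(inside, torch.cos(n * torch.acos(u_clamped)), torch.zeros(()))
    return basis  # shape (..., grid_size, degree + 1)

x = torch.linspace(-2, 2, 7)
print(cheby_grid(x).shape)  # torch.Size([7, 5, 5])
```

With the truncated definition, each cell's coefficients only see inputs that fall inside that cell, which is the locality the B-spline grid provides in the original KAN.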

yuedajiong commented 2 months ago

Yes, we can use cos(n * acos(x)) as T_n(x), but we must use a Taylor expansion or other algorithms to compute acos and cos. Is this an efficient way?

iiisak commented 2 months ago

> Yes, we can use cos(n * acos(x)) as T_n(x), but we must use a Taylor expansion or other algorithms to compute acos and cos. Is this an efficient way?

@yuedajiong That's unrelated to this issue, but to answer your question: yes, the trig definition of Chebyshev polynomials is more efficient than the recursive one, as modern GPUs implement fast hardware trig functions.
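For reference, a small sketch (illustrative; the function names are not from this repo) of the two equivalent ways of evaluating T_0 through T_degree, showing that the trig form avoids the sequential recurrence loop:

```python
import torch

def cheby_trig(x, degree):
    # Trig form: T_n(x) = cos(n * acos(x)); one acos plus a batched cos.
    n = torch.arange(degree + 1, dtype=x.dtype)
    return torch.cos(n * torch.acos(x).unsqueeze(-1))

def cheby_recursive(x, degree):
    # Recurrence: T_0 = 1, T_1 = x, T_{n+1} = 2x*T_n - T_{n-1};
    # requires a sequential Python loop of `degree` steps.
    T = [torch.ones_like(x), x]
    for _ in range(degree - 1):
        T.append(2 * x * T[-1] - T[-2])
    return torch.stack(T[: degree + 1], dim=-1)

x = torch.rand(4) * 2 - 1  # inputs in [-1, 1]
print(torch.allclose(cheby_trig(x, 4), cheby_recursive(x, 4), atol=1e-5))  # True
```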