Open jecampagne opened 2 years ago
Welcome! This is a good channel - we also have https://github.com/google/neural-tangents/discussions, either place is OK.
`emp_ntk_kernel_fn` is the finite-width kernel function; it returns the outer product of Jacobians of `apply_fn` wrt parameters `theta`. Since it depends on the specific parameters `theta`, you'll call it like `emp_ntk_kernel_fn(x1, x2, theta)`. In short, this kernel describes the behavior of the linearization of `apply_fn` around `theta`. This is also `\hat Theta` from https://arxiv.org/pdf/1902.06720.pdf.
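To make the "outer product of Jacobians" concrete, here is a minimal plain-JAX sketch of the empirical NTK for a hypothetical toy two-layer network (the network and names here are illustrative stand-ins, not the library's implementation; in practice you would use `nt.empirical_ntk_fn(apply_fn)`):

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

# Hypothetical tiny network standing in for apply_fn; any
# differentiable apply_fn(theta, x) -> scalar works the same way.
def apply_fn(theta, x):
    W1, W2 = theta
    h = jnp.tanh(W1 @ x)
    return (W2 @ h)[0]

def emp_ntk_kernel_fn(x1, x2, theta):
    # \hat Theta(x1, x2) = J(x1) @ J(x2)^T, where J is the Jacobian
    # of apply_fn with respect to the (flattened) parameters theta.
    j1, _ = ravel_pytree(jax.grad(apply_fn)(theta, x1))
    j2, _ = ravel_pytree(jax.grad(apply_fn)(theta, x2))
    return j1 @ j2

key1, key2 = jax.random.split(jax.random.PRNGKey(0))
theta = (jax.random.normal(key1, (8, 4)),
         jax.random.normal(key2, (1, 8)))
x1 = jnp.array([1.0, -0.5, 0.3, 2.0])
x2 = jnp.array([0.2, 1.5, -1.0, 0.7])

# Depends on the specific theta, hence the call signature.
k12 = emp_ntk_kernel_fn(x1, x2, theta)
```

Note the kernel is symmetric by construction (`k(x1, x2) = k(x2, x1)`) and `k(x, x)` is a squared Jacobian norm, hence non-negative.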
`kernel_fn` returned by `stax` is the infinite-width kernel function, namely, for the same architecture, `kernel_fn(x1, x2) = plim_{n->infty} empirical_ntk_fn(x1, x2, theta)`, assuming that `theta ~ N(0, 1)`, i.e. weights are i.i.d. Gaussians. (Minor note: this assumes `parameterization='ntk'`; `'standard'` is slightly different per https://arxiv.org/pdf/2001.07301.pdf.) I.e. the random variable `empirical_ntk_fn(x1, x2, theta)` (given random normal `theta`) converges in probability to the constant kernel `kernel_fn(x1, x2)`. In short, this kernel describes the behavior of the infinite ensemble of infinite-width `apply_fn` networks. This is also `Theta` from https://arxiv.org/pdf/1902.06720.pdf.
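The convergence in probability can be seen numerically: for a toy one-hidden-layer ReLU network in NTK parameterization, the spread of the empirical NTK across random draws of `theta` shrinks as the width grows. This is a plain-JAX sketch under assumed toy definitions (not the library's own code), just to illustrate the concentration:

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

def make_apply(width):
    # One-hidden-layer ReLU network in NTK parameterization:
    # theta ~ N(0, 1), with explicit 1/sqrt(fan_in) scaling factors.
    def apply_fn(theta, x):
        W1, W2 = theta
        h = jnp.maximum(W1 @ x / jnp.sqrt(x.shape[0]), 0.0)
        return (W2 @ h)[0] / jnp.sqrt(width)
    return apply_fn

def empirical_ntk(apply_fn, x1, x2, theta):
    # Outer product of parameter Jacobians at x1 and x2.
    j1, _ = ravel_pytree(jax.grad(apply_fn)(theta, x1))
    j2, _ = ravel_pytree(jax.grad(apply_fn)(theta, x2))
    return j1 @ j2

def ntk_samples(width, n_samples, x1, x2, d=4):
    # Draw n_samples random thetas and record the empirical NTK value.
    apply_fn = make_apply(width)
    vals = []
    for i in range(n_samples):
        k1, k2 = jax.random.split(jax.random.PRNGKey(i))
        theta = (jax.random.normal(k1, (width, d)),
                 jax.random.normal(k2, (1, width)))
        vals.append(empirical_ntk(apply_fn, x1, x2, theta))
    return jnp.stack(vals)

x1 = jnp.array([1.0, -0.5, 0.3, 2.0])
x2 = jnp.array([0.2, 1.5, -1.0, 0.7])

narrow = ntk_samples(32, 20, x1, x2)
wide = ntk_samples(512, 20, x1, x2)
# The standard deviation across random initializations shrinks
# with width, consistent with convergence to a constant kernel.
```

The wider network's empirical NTK varies much less across initializations; in the infinite-width limit it equals the deterministic `kernel_fn(x1, x2)` that `stax.serial` returns.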
Hope this helps!
Great! Look at my last post. Thanks
Hello, I'm a newbie with your library, which really looks nice indeed, and I would like to take advantage of it to make some exercises to illustrate a private lecture for some colleagues. So, I wouldn't like to make rough mistakes (notice that I have posted another question in the same spirit). Let me know if there is a forum dedicated to this kind of user exchange somewhere.
So, after
one can do
If I am right, the `emp_ntk_kernel_fn` is the finite-size NTK kernel based on the network architecture, but then what is the difference with the `kernel_fn`, i.e. the third return argument of `stax.serial`? Thanks