-
Hey, thanks for the great work!
I'm using BatchNorm in my network, but I have set the `use_running_average` parameter of the BatchNorm layers to `True`, which means they will not compute any running mean/std…
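For reference, here is my understanding of what inference-mode BatchNorm computes when the running statistics are frozen (a minimal numpy sketch; the function and argument names are mine, not Flax's):

```python
import numpy as np

def batchnorm_inference(x, running_mean, running_var, gamma, beta, eps=1e-5):
    # With use_running_average=True, the layer normalizes with the stored
    # statistics and does NOT update them from the current batch.
    return gamma * (x - running_mean) / np.sqrt(running_var + eps) + beta

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])
running_mean = np.array([2.0, 3.0])   # frozen per-feature statistics
running_var = np.array([1.0, 1.0])

out = batchnorm_inference(x, running_mean, running_var, gamma=1.0, beta=0.0)
```

So in this mode the layer is a fixed affine map of its input, independent of the batch contents.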
-
I am confused by the behavior of the following snippet of code (the WideResNet from the README with standard parameterization):
```python
import jax
from neural_tangents import stax
def Wide…
-
Thanks for your great idea and detailed work, and I hope you are enjoying your day so far.
I have a question regarding your paper "Dataset Distillation by Matching Training Trajectories". In the th…
-
Hi,
Regarding `Erf()` in stax, I want to confirm the implementation.
When we consider a 2-layer MLP without training the last layer, the NTK is the covariance matrix of the data multiplied by
…
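To make the question concrete, here is a pure-numpy Monte-Carlo check of what I believe the NTK should be for a 2-layer erf network when the readout layer is frozen (the setup and normalizations are my assumptions, not necessarily stax's):

```python
import numpy as np

def erf_prime(z):
    # derivative of erf: 2/sqrt(pi) * exp(-z^2)
    return 2.0 / np.sqrt(np.pi) * np.exp(-z * z)

rng = np.random.default_rng(0)
m = 200_000                       # hidden-layer width for the Monte-Carlo estimate
x1 = np.array([1.0, 0.5, -0.2])
x2 = np.array([0.3, -1.0, 0.8])

W = rng.normal(size=(m, 3))       # first-layer weights ~ N(0, 1)
u1, u2 = W @ x1, W @ x2           # pre-activations

# With the readout frozen, only first-layer gradients contribute, so
#   Theta(x1, x2) = (x1 . x2) * E_w[erf'(w . x1) * erf'(w . x2)]
theta_mc = (x1 @ x2) * np.mean(erf_prime(u1) * erf_prime(u2))

# The Gaussian expectation has a closed form: for (u, v) ~ N(0, S),
#   E[erf'(u) erf'(v)] = (4/pi) / sqrt((1 + 2*S11)(1 + 2*S22) - 4*S12^2)
s11, s22, s12 = x1 @ x1, x2 @ x2, x1 @ x2
theta_exact = s12 * (4.0 / np.pi) / np.sqrt(
    (1 + 2 * s11) * (1 + 2 * s22) - 4 * s12 ** 2)
```

The two estimates agree, which matches my reading that the NTK here is the data covariance times the erf-derivative kernel.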
-
Dear lulu, DeepXDE is a great tool. However, I have some questions. I am solving a 2-dimensional heat equation, and I want to use spatial coordinates and boundary conditions as input to the neural netw…
-
After updating my environment to work with a more recent version of JAX and FLAX, I have noticed that the empirical NTK Gram matrices computed using `nt.batch` applied to `nt.empirical_kernel_fn` are …
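For context, my understanding is that the empirical NTK is the Jacobian contraction Θ(x1, x2) = ∇f(x1)·∇f(x2), which should not depend on how the computation is batched. A tiny self-contained numpy sketch of that definition (the toy model and helper names are mine, not the library's):

```python
import numpy as np

def f(theta, x):
    # toy scalar-output model with a flat parameter vector:
    # hidden layer (3 -> 2, tanh) followed by a linear readout
    W = theta[:6].reshape(2, 3)
    v = theta[6:]
    return v @ np.tanh(W @ x)

def grad_f(theta, x, eps=1e-6):
    # central finite-difference gradient of f with respect to theta
    g = np.zeros_like(theta)
    for i in range(theta.size):
        tp, tm = theta.copy(), theta.copy()
        tp[i] += eps
        tm[i] -= eps
        g[i] = (f(tp, x) - f(tm, x)) / (2 * eps)
    return g

rng = np.random.default_rng(1)
theta = rng.normal(size=8)
x1, x2 = rng.normal(size=3), rng.normal(size=3)

# empirical NTK entries: dot products of parameter gradients
ntk_11 = grad_f(theta, x1) @ grad_f(theta, x1)
ntk_12 = grad_f(theta, x1) @ grad_f(theta, x2)
```

Since each entry is just a fixed dot product, I would expect batched and unbatched evaluations to agree up to floating-point accumulation order.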
-
Hi, would you mind explaining the difference between a general `Kernel` type and a `` type? Thanks ahead!
-
As can be seen in the example below, I expected that decorating a kernel function with `batch` doesn't change the return shape of the kernel function, i.e. both `res_5` and `res_10` below should have …
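Here is what I expect `batch` to do, sketched in plain numpy with a stand-in Gram-matrix kernel (the helper below is hypothetical, not the library's implementation): compute the kernel block-by-block and reassemble, so the output shape and values match the unbatched call.

```python
import numpy as np

def kernel(x1, x2):
    # stand-in kernel: plain Gram matrix
    return x1 @ x2.T

def batched(kernel_fn, batch_size):
    # evaluate the kernel in (batch_size x batch_size) blocks
    # and stitch the pieces back into the full matrix
    def kfn(x1, x2):
        rows = []
        for i in range(0, len(x1), batch_size):
            cols = [kernel_fn(x1[i:i + batch_size], x2[j:j + batch_size])
                    for j in range(0, len(x2), batch_size)]
            rows.append(np.concatenate(cols, axis=1))
        return np.concatenate(rows, axis=0)
    return kfn

x = np.random.default_rng(2).normal(size=(10, 4))
res_full = kernel(x, x)
res_batched = batched(kernel, 5)(x, x)
# same shape and same values, regardless of batch size
```

Under this reading, batching is purely a memory optimization and should be invisible in the returned array.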
-
Hi, thanks for this handy library. I have a basic question. I was trying to apply nngp on graph convolutional layers. For example,
```
# graph-nngp
init_fn, apply_fn, kernel_fn = stax.serial(
…
-
I am truly confused about the difference between `gradient_descent_mse` and `gradient_descent_mse_ensemble`.
In the original NNGP and NTK paper, the author mentioned that we can use Gaussi…
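My current understanding, which I'd like to check: the training dynamics under gradient descent on an MSE loss have a closed form in the kernel, and the infinite-time limit is plain kernel regression. Here is a pure-numpy sketch of the mean prediction those dynamics should give (the RBF kernel and all names here are stand-ins, not the library's API):

```python
import numpy as np

rng = np.random.default_rng(3)
x_train = rng.normal(size=(8, 2))
x_test = rng.normal(size=(4, 2))
y_train = rng.normal(size=(8, 1))

def rbf(a, b):
    # stand-in kernel; in the infinite-width setting this would be the NTK
    return np.exp(-0.5 * np.sum((a[:, None] - b[None]) ** 2, axis=-1))

k_train_train = rbf(x_train, x_train) + 1e-6 * np.eye(8)   # jitter for stability
k_test_train = rbf(x_test, x_train)

# Finite-time mean prediction under gradient-descent MSE training, f_0 = 0:
#   f_t(x_test) = K_test,train K^{-1} (I - exp(-K t)) y
t = 1e8
evals, evecs = np.linalg.eigh(k_train_train)
decay = evecs @ np.diag(1.0 - np.exp(-evals * t)) @ evecs.T
pred_t = k_test_train @ np.linalg.solve(k_train_train, decay @ y_train)

# t -> infinity limit: plain (ridgeless) kernel regression
pred_inf = k_test_train @ np.linalg.solve(k_train_train, y_train)
```

At large `t` the two predictions coincide; my impression is that the "ensemble" version additionally returns the predictive covariance over the infinite ensemble of networks, while the plain version just evolves given kernel matrices.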