google / objax

Apache License 2.0

How to compute Jacobian of outputs w.r.t. inputs #224

Closed smao-astro closed 3 years ago

smao-astro commented 3 years ago

Hi,

I am new to JAX and Objax, and I would like to compute the "partial derivative" of the outputs w.r.t. the inputs. Below is a piece of code:

import objax
import jax.numpy as jnp
import jax

m = objax.nn.Sequential([
    objax.nn.Linear(1, 10),
    objax.functional.elu
])

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (10, 1))

dydx = jax.vmap(jax.jacfwd(m))(x)

The docs advise against mixing JAX's and Objax's transformations, so my questions are:

  1. I cannot find an API in Objax that does jacfwd or jacrev, so what is the standard way to calculate a Jacobian?
  2. Why is mixing JAX and Objax discouraged? Is it always a bad idea, or is it allowed and beneficial in some cases?
  3. I understand that Objax is more object-oriented and stateful while JAX is stateless, but what is the difference between jax.vmap and objax.Vectorize?

Thanks.

AlexeyKurakin commented 3 years ago
  1. Compute Jacobian.

You can reimplement an object-oriented version of jacfwd or jacrev, similar to how objax.GradValues is implemented.

Another option is to combine objax.Vectorize and objax.GradValues, in other words, to vectorize the computation of the gradient. The DPSGD gradient module does this to some extent: https://github.com/google/objax/blob/c4785ff991f35dc1af5a68988e8a545d3304de90/objax/privacy/dpsgd/gradient.py#L76
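The "vectorize the gradient" idea can be illustrated in plain JAX with a stateless function. This is only a sketch of the idea, not Objax code: `model`, `jacobian_via_grad`, and the explicit `params` tuple are hypothetical stand-ins for the Linear(1, 10) + ELU network in the question. Row i of the Jacobian is the gradient of the scalar output y[i], so mapping the gradient over i assembles the whole matrix.

```python
import jax
import jax.numpy as jnp

def model(params, x):
    # Stateless stand-in for Linear(1, 10) followed by ELU; x has shape (1,).
    w, b = params
    return jax.nn.elu(x @ w + b)  # shape (10,)

def jacobian_via_grad(params, x):
    # Row i of the Jacobian is the gradient of the scalar output y[i] w.r.t. x.
    def output_i(x, i):
        return model(params, x)[i]
    indices = jnp.arange(params[1].shape[0])
    # Vectorize the gradient over the output index i; x is shared (not mapped).
    return jax.vmap(jax.grad(output_i), in_axes=(None, 0))(x, indices)

k_w, k_b, k_x = jax.random.split(jax.random.PRNGKey(0), 3)
params = (jax.random.normal(k_w, (1, 10)), jax.random.normal(k_b, (10,)))
xs = jax.random.normal(k_x, (4, 1))

# Per-example Jacobians for a batch of inputs: shape (batch, outputs, inputs).
jac = jax.vmap(jacobian_via_grad, in_axes=(None, 0))(params, xs)
# Reference computed with jax.jacfwd on the same stateless function.
ref = jax.vmap(jax.jacfwd(model, argnums=1), in_axes=(None, 0))(params, xs)
```

In Objax, the same structure would presumably use objax.GradValues for the inner gradient and objax.Vectorize for the outer mapping, so that the module's variables are tracked by the wrappers.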

  2. Why not mix JAX and Objax

You cannot use functional JAX transformations (jax.vmap, jax.pmap, jax.grad, etc.) with Objax, that is, the transformations that take a function and return a new function. Other JAX operations (for example, everything in jax.numpy.*) are safe to use with Objax.

All JAX primitives are stateless and purely functional (i.e. they neither have nor assume side effects). Objax provides wrappers for JAX primitives to simplify state management and make it more natural for machine learning applications.

So if you mix JAX functional transformations with Objax primitives, it will break the state management, and the code will either not work at all or work incorrectly.
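A small plain-Python illustration of how state can silently go wrong under functional transformations. `CountingScale` is a hypothetical class invented for this example, not an Objax module; it only mimics an object that mutates its own state when called. JAX traces the Python function once and then works with the traced computation, so the side effect is not repeated per example or per call.

```python
import jax
import jax.numpy as jnp

class CountingScale:
    """Hypothetical stateful callable: counts how often its Python body runs."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x):
        self.calls += 1  # Python side effect that JAX does not track
        return 2.0 * x

layer = CountingScale()
batch = jnp.arange(5.0)

# vmap traces the function once for the whole batch of 5 elements.
out = jax.vmap(lambda x: layer(x))(batch)
calls_after_vmap = layer.calls

# jit traces on the first call and reuses the compiled result afterwards.
jitted = jax.jit(lambda x: layer(x))
for _ in range(3):
    jitted(batch)
calls_after_jit = layer.calls

print(calls_after_vmap, calls_after_jit)
```

The counter ends up far below the number of examples and calls that were actually processed, which is the kind of inconsistency the Objax wrappers are meant to prevent.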

  3. Difference between JAX and Objax primitives

As I mentioned above, Objax provides wrappers which simplify state management. For example, objax.Vectorize is a wrapper around jax.vmap that handles the state management and lets stateful Objax primitives work with stateless JAX.
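For comparison, this is roughly the bookkeeping that has to be done by hand when using jax.vmap directly: the parameters are passed in explicitly and marked as shared across the batch. It is a sketch in plain JAX with hypothetical names (`linear_elu`, `params`), not a description of how objax.Vectorize is implemented.

```python
import jax
import jax.numpy as jnp

def linear_elu(params, x):
    # Pure function: all "state" (weights and bias) arrives as an argument.
    w, b = params
    return jax.nn.elu(x @ w + b)

key_w, key_b, key_x = jax.random.split(jax.random.PRNGKey(0), 3)
params = (jax.random.normal(key_w, (1, 10)), jax.random.normal(key_b, (10,)))
xs = jax.random.normal(key_x, (10, 1))

# in_axes=(None, 0): params are shared by every example, xs is mapped on axis 0.
batched = jax.vmap(linear_elu, in_axes=(None, 0))
ys = batched(params, xs)

# The same result computed one example at a time, for comparison.
ys_loop = jnp.stack([linear_elu(params, x) for x in xs])
```

With a wrapper such as objax.Vectorize, the variables are declared once on the wrapper instead of being threaded through every call like `params` is here.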

smao-astro commented 3 years ago

I see, thank you for your explanation!

chaoming0625 commented 2 years ago
  1. Compute Jacobian.

You can reimplement object-oriented version of jacfwd or jacrev similarly how objax.GradValues is implemented.

It is hard to implement an object-oriented version of jacfwd or jacrev, because jacfwd and jacrev do not support returning auxiliary data. Is there a solution? Thanks!
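One possible route, assuming an installed JAX version in which jax.jacfwd and jax.jacrev accept the has_aux argument (older releases did not): the wrapped function returns a pair (output, aux), only the first element is differentiated, and aux is passed through unchanged. The function `f_with_aux` below is a hypothetical toy example, not Objax code.

```python
import jax
import jax.numpy as jnp

def f_with_aux(x):
    # x has shape (1,); y has shape (3,).
    y = jnp.tanh(x) * jnp.arange(1.0, 4.0)
    aux = {"output": y}  # auxiliary data returned alongside the Jacobian
    return y, aux

x = jnp.array([0.5])

# With has_aux=True the transformed function returns (jacobian, aux).
jac_fwd, aux = jax.jacfwd(f_with_aux, has_aux=True)(x)
jac_rev, _ = jax.jacrev(f_with_aux, has_aux=True)(x)
```

In an object-oriented wrapper, the aux slot could presumably carry the updated state tensors out of the transformed function, which is the role aux plays for gradients.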