Jice-Zeng opened this issue 2 months ago
Hi,

Yes, you are right. When refactoring the package for the submission, I introduced an error in `make_maf`. I'll push a fix later. This should be

```python
base_distribution = distrax.Independent(
    distrax.Normal(jnp.zeros(curr_dim), jnp.ones(curr_dim)),
    1,
)
```

and not `n_dimension`.
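For context, the broadcasting error comes from exactly this: each surjective layer reduces the dimensionality, so the base distribution of an inner layer has to match the current, reduced dimension rather than the full data dimension. A rough illustration (the loop and variable names are just for exposition, not the actual `make_maf` source):

```python
import distrax
import jax.numpy as jnp

n_dimension = 8                       # full data dimension
n_layer_dimensions = [8, 8, 5, 5, 5]  # per-layer dimensions, shrinking at funnels

for curr_dim in n_layer_dimensions:
    # correct: the event size follows the current layer's dimensionality
    base_distribution = distrax.Independent(
        distrax.Normal(jnp.zeros(curr_dim), jnp.ones(curr_dim)),
        1,  # reinterpret the last axis as the event dimension
    )
    # using jnp.zeros(n_dimension) here instead breaks as soon as
    # curr_dim < 8, which is the "(100, 5) vs (8,)" broadcast error below
```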
Cheers, Simon
Yes, it works after I changed `n_dimension` to `curr_dim`. I also tried to implement a different surjective layer, `MaskedCouplingInferenceFunnel`, as you can see from the code I pasted in the issue. Even after changing `n_dimension` to `curr_dim`, I still get an error:
```
     75 def _bijector_fn(params):
     76     # print(params.shape)
---> 77     means, log_scales = unstack(params, -1)
     80     return distrax.ScalarAffine(means, jnp.exp(log_scales))

ValueError: too many values to unpack (expected 2)
```
Do you have any ideas on how to implement `MaskedCouplingInferenceFunnel`?
The issue here is that a MADE network outputs different parameter shapes than an MLP. I think in that case you need to use the following bijector:

```python
def bijector_fn(params):
    shift, log_scale = jnp.split(params, 2, axis=-1)
    return distrax.ScalarAffine(shift, jnp.exp(log_scale))
```
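To make the shape difference concrete, here is a quick check of this bijector (my example; the conditioner output layout, `[shift | log_scale]` concatenated along the last axis, is the usual convention for MLP-based coupling layers):

```python
import distrax
import jax.numpy as jnp

def bijector_fn(params):
    shift, log_scale = jnp.split(params, 2, axis=-1)
    return distrax.ScalarAffine(shift, jnp.exp(log_scale))

# An MLP conditioner for a coupling layer emits 2 * dim values per sample.
params = jnp.zeros((100, 2 * 5))  # batch of 100, event dimension 5
bijector = bijector_fn(params)
y, log_det = bijector.forward_and_log_det(jnp.ones((100, 5)))
print(y.shape, log_det.shape)     # (100, 5) (100, 5) -- elementwise affine
```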
Thanks for reporting all this! I need to put all of that in the documentation.
Hi Dirmeier,
I think I misled you. The original `make_maf` works well after changing `n_dimension` to `curr_dim`. `make_maf` uses `MaskedAutoregressive` as the bijective layer and `AffineMaskedAutoregressiveInferenceFunnel` as the surjective layer. The bijector

```python
def _bijector_fn(params):
    means, log_scales = unstack(params, -1)
    return distrax.ScalarAffine(means, jnp.exp(log_scales))
```

also works well.
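For anyone else hitting the unpack error: this `_bijector_fn` works with a MADE conditioner because MADE stacks the parameters along a trailing axis of size 2, i.e. shape `(..., dim, 2)`, while an MLP concatenates them into shape `(..., 2 * dim)`. A small sketch (the `unstack` helper here mimics the one from `surjectors.util`; the shapes are my assumption from the snippets above):

```python
import distrax
import jax.numpy as jnp

def unstack(x, axis=-1):
    # mimics surjectors.util.unstack: split along `axis` and drop that axis
    return tuple(jnp.squeeze(s, axis) for s in jnp.split(x, x.shape[axis], axis))

def _bijector_fn(params):
    means, log_scales = unstack(params, -1)
    return distrax.ScalarAffine(means, jnp.exp(log_scales))

made_params = jnp.zeros((100, 5, 2))  # MADE output: (..., dim, 2)
_bijector_fn(made_params)             # fine: unstack yields two (100, 5) arrays

mlp_params = jnp.zeros((100, 10))     # MLP output: (..., 2 * dim)
# _bijector_fn(mlp_params) raises
# "ValueError: too many values to unpack (expected 2)"
# because splitting the last axis gives 10 pieces, not 2.
```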
My issue is that I tried to implement an affine (surjective) masked coupling flow, with `MaskedCoupling` as the bijective layer and `MaskedCouplingInferenceFunnel` as the surjective layer. I have opened another issue, #46, to avoid confusion for other readers.
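For cross-reference with #46, here is a rough sketch of how the pieces might fit together, assuming the Haiku-based setup sbijax uses; the constructor arguments of `MaskedCouplingInferenceFunnel` and the decoder wiring are my guesses from the surjectors docs, so treat this as a starting point rather than a working recipe:

```python
import distrax
import haiku as hk
import jax.numpy as jnp
from surjectors import MaskedCouplingInferenceFunnel

def bijector_fn(params):
    # MLP conditioners concatenate parameters, so split rather than unstack
    shift, log_scale = jnp.split(params, 2, axis=-1)
    return distrax.ScalarAffine(shift, jnp.exp(log_scale))

def decoder_fn(n_dropped):
    # conditional density over the dimensions the funnel drops
    def fn(z):
        params = hk.nets.MLP([32, 32, 2 * n_dropped])(z)
        mu, log_scale = jnp.split(params, 2, axis=-1)
        return distrax.Independent(distrax.Normal(mu, jnp.exp(log_scale)), 1)
    return fn

# all of this would live inside an hk.transform-ed flow-building function
n_keep, n_dropped = 5, 3  # e.g. reduce 8 dimensions to 5
funnel = MaskedCouplingInferenceFunnel(
    n_keep=n_keep,
    decoder=decoder_fn(n_dropped),
    conditioner=hk.nets.MLP([64, 64, 2 * n_keep]),  # output size is a guess:
                                                    # 2x the transformed dims
    bijector_fn=bijector_fn,
)
```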
Hi Simon,

I am trying to run the SLCP example using surjective layers, specifically the file slcp-snle:

```python
y_obs = jnp.array([[
    -0.9707123, -2.9461224, -0.4494722, -3.4231849,
    -0.13285634, -3.364017, -0.85367596, -2.4271638,
]])
fns = prior_fn, simulator_fn
neural_network = make_maf(8, n_layer_dimensions=[8, 8, 5, 5, 5])
snl = NLE(fns, neural_network)
optimizer = optax.adam(1e-4)
data, _ = snl.simulate_data(jr.PRNGKey(0), n_simulations=10_000)
params_t, losses = snl.fit(
    jr.PRNGKey(0),
    data=data,
    n_early_stopping_patience=100,
    optimizer=optimizer,
    n_iter=1000,
)
```

Unfortunately, I got an error:

```
ValueError: Incompatible shapes for broadcasting: shapes=[(100, 5), (8,)]
```
If I make all layers bijective, e.g. n_layer_dimensions=[8, 8, 8, 8, 8], the implementation works well, so I guess the issue comes from the surjective layers. I tried modifying the code many times, but the error persists. Can you give me some ideas? Thanks!
I also tried another neural network, `MaskedCouplingInferenceFunnel`, and got the error `ValueError: too many values to unpack (expected 2)`. Thanks for your contribution to the library, and I am looking forward to your reply.
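For readers skimming the thread: both errors above are parameter- and event-shape mismatches. A minimal reconstruction of the broadcasting failure, assuming it originates in the base distribution's log-prob as per the fix at the top of the thread:

```python
import jax.numpy as jnp

z = jnp.zeros((100, 5))  # batch of 100 samples, reduced to 5 dims by the funnels
loc = jnp.zeros(8)       # base distribution mistakenly built with n_dimension=8
z - loc                  # ValueError: Incompatible shapes for broadcasting:
                         #     shapes=[(100, 5), (8,)]
```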