bayesiains / nflows

Normalizing flows in PyTorch
MIT License

Flow identity initialization #72

Open francesco-vaselli opened 1 year ago

francesco-vaselli commented 1 year ago

Hello and thanks for the work on the package,

I am doing some tests with the identity initialization for rational quadratic splines. When using the new identity init implemented in #65, and giving as input x = torch.tensor([1, 1e-2, 1e-6, 1e-8, 1e2], dtype=torch.float32), the following is the inverse computed by the untrained network (which should be initialized to the identity):

# in spline def: enable_identity_init=True

# transform back
flow.transform_to_noise(x.view(-1,1))
tensor([[  1.7013],
        [  1.3796],
        [  1.3739],
        [  1.3739],
        [100.0000]], grad_fn=<AddmmBackward0>)

If instead I manually set the weights of the final layer of the transform network to 0 (as is done in the normflows package), I get the identity as expected:

# in spline def: enable_identity_init=False

# in the model def
import numpy as np  # needed for the bias value below

if init_identity:
    torch.nn.init.constant_(autoregressive_net.final_layer.weight, 0.0)
    torch.nn.init.constant_(
        autoregressive_net.final_layer.bias,
        np.log(np.exp(1 - min_derivative) - 1),
    )

# stuff

# transform back
flow.transform_to_noise(x.view(-1,1))
tensor([[1.0000e+00],
        [1.0000e-02],
        [1.0000e-06],
        [1.0000e-08],
        [1.0000e+02]], grad_fn=<AddmmBackward0>)

I was wondering whether you could help me figure out this difference in behavior. If this seems potentially useful, I would gladly work on a pull request. Best regards, Francesco

arturbekasov commented 1 year ago

Hey Francesco,

Yeah, this is expected: we don't explicitly initialize the autoregressive_net to the identity. The flag applies to the spline function only.

I think having a way to initialize the whole flow to (near?) identity could be useful. I'd happily review a PR on this.
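To illustrate why zeroing the final layer gets you there: once the conditioner's last linear layer has zero weights, its output is a constant (the bias) regardless of the input, so the spline parameters become input-independent and can be pinned to the identity values. A minimal sketch with a stand-in network (not the actual nflows conditioner):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for an autoregressive conditioner. Zeroing the final
# layer's weights makes the output equal to the bias for every input, so the
# downstream spline parameters no longer depend on x.
net = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 8))
final_layer = net[-1]
nn.init.zeros_(final_layer.weight)
nn.init.zeros_(final_layer.bias)

x = torch.tensor([[1.0], [1e-2], [1e2]])
with torch.no_grad():
    out = net(x)

# All rows are identical constants:
print(torch.allclose(out[0], out[1]) and torch.allclose(out[0], out[2]))  # True
```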

Cheers,

Artur