probabilists / zuko

Normalizing flows in PyTorch
https://zuko.readthedocs.io
MIT License

Do NSF use coupling layers or autoregressive layers? #2

Closed: michaeldeistler closed this issue 1 year ago

michaeldeistler commented 1 year ago

Description

I'm a bit confused about the NSF module of the package. Does it use autoregressive transforms or coupling layers (as in the original NSF paper)? The NSF is significantly faster at log_prob than at sampling, so I'm guessing it uses autoregressive layers. If so, are coupling layers currently implemented in the toolbox?

Reproduce

import torch
import zuko
import time

dim_density = 200

x = torch.randn(dim_density)
y = torch.randn(5)

# Neural spline flow (NSF) with 1 transformation
flow = zuko.flows.NSF(dim_density, 5, transforms=1, hidden_features=[128] * 2)

# Sample 64 points x ~ p(x | y)
start_time = time.time()
x = flow(y).sample((64,))
print(time.time() - start_time)  # takes about 1 second

# Evaluate log p(x | y)
start_time = time.time()
log_p = flow(y).log_prob(x)
print(time.time() - start_time)  # takes about 0.014 seconds

Thanks a lot for your help!

francois-rozet commented 1 year ago

Hello 👋 By default, all flows are fully autoregressive. However, there is a parameter to choose the number of passes for the inverse transformations. Setting passes=2 is equivalent to coupling. But you can choose anything between 2 and the number of features.
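
To make the role of `passes` concrete, here is a minimal pure-Python sketch of how features could be split into groups and which dependencies each setting allows. The helper names `pass_groups` and `dependency_mask` are hypothetical, and the contiguous, equally sized grouping is an assumption for illustration, not necessarily how zuko orders features internally.

```python
import math
from typing import List, Optional


def pass_groups(features: int, passes: Optional[int] = None) -> List[int]:
    """Assign each feature index to a group (hypothetical helper).

    Assumption: groups are contiguous and as equal in size as possible.
    passes=None means one group per feature (fully autoregressive).
    """
    if passes is None:
        passes = features
    passes = max(1, min(passes, features))
    size = math.ceil(features / passes)
    return [i // size for i in range(features)]


def dependency_mask(groups: List[int]) -> List[List[bool]]:
    """mask[i][j] is True when feature i may depend on feature j,
    i.e. when j belongs to a strictly earlier group than i."""
    return [[gj < gi for gj in groups] for gi in groups]


coupling = pass_groups(6, passes=2)    # [0, 0, 0, 1, 1, 1]
autoregressive = pass_groups(6)        # [0, 1, 2, 3, 4, 5]

for row in dependency_mask(coupling):
    print(["x" if dep else "." for dep in row])
```

With two groups, the first group depends only on the context and the second group depends on the first, which is the coupling pattern. Inverting the transformation then takes two sequential passes instead of one per feature.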

michaeldeistler commented 1 year ago

Got it, thanks for the clarification!

francois-rozet commented 1 year ago

Do you think this should be better explained in the documentation?

michaeldeistler commented 1 year ago

Yes, I think this would be helpful. Maybe either add an example or extend the docstring like:

passes: The number of sequential passes for the inverse transformation. If None, use the number of features (making the network fully autoregressive). `passes=2` corresponds to coupling layers.

I had searched the API docs (Ctrl+F) for the word "coupling" and could not find anything ;)

It might also be worth adding something like this to the NSF docstring:

Note that, by default, the conditioner is fully autoregressive. Coupling layers (as used in Durkan et al. 2019) can be used by setting `passes=2`.

francois-rozet commented 1 year ago

The proposed changes have been added in the last release 🥳