Hi,

I'm seeing an unreasonable increase in RAM usage (on the order of gigabytes) when training a normalising flow with batch norm between layers. I believe this is caused by the computation graph being extended each time the running mean is updated here (this PyTorch issue reports a similar problem). This does not appear to be an issue when using batch norm within layers, since those layers use nn.BatchNorm1d.
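For illustration, here is a standalone toy snippet (not nflows code) showing the mechanism I suspect: if a grad-tracking batch statistic is folded into a persistent running statistic without being detached, every iteration's graph stays reachable and memory grows steadily.

import torch

running_mean = torch.zeros(2)
momentum = 0.1
for _ in range(1000):
    x = torch.randn(1024, 2, requires_grad=True)
    batch_mean = x.mean(0)  # part of this iteration's autograd graph
    # Out-of-place update without detach(): running_mean now carries a grad_fn
    # chaining back through every previous iteration, so no graph is ever freed.
    running_mean = (1 - momentum) * running_mean + momentum * batch_mean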
I will also submit a pull request with minor changes that should fix this (assuming it is indeed a bug).
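The change I have in mind is essentially to detach the batch statistics (or wrap the update in torch.no_grad()) before folding them into the running statistics. A minimal sketch, using a stand-in module rather than the actual nflows BatchNorm transform:

import torch
from torch import nn

class BatchNormSketch(nn.Module):
    """Stand-in illustrating the proposed buffer update, not the nflows class."""

    def __init__(self, features, momentum=0.1):
        super().__init__()
        self.momentum = momentum
        self.register_buffer("running_mean", torch.zeros(features))
        self.register_buffer("running_var", torch.ones(features))

    def update_running_stats(self, inputs):
        batch_mean, batch_var = inputs.mean(0), inputs.var(0)
        # Update the buffers under no_grad so the update is not recorded in the
        # computation graph and the graph can be freed after backward().
        with torch.no_grad():
            self.running_mean.mul_(1 - self.momentum).add_(self.momentum * batch_mean)
            self.running_var.mul_(1 - self.momentum).add_(self.momentum * batch_var)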
To reproduce:
Run the following snippet and monitor RAM usage:
import sklearn.datasets as datasets
import torch
from torch import optim
from nflows.flows.realnvp import SimpleRealNVP
# batch_norm_between_layers=True is what triggers the memory growth
flow = SimpleRealNVP(2, 32, 4, 2, batch_norm_between_layers=True)
optimizer = optim.Adam(flow.parameters())
num_iter = 1000
for i in range(num_iter):
    # Resample a two-moons batch and take one training step
    x, y = datasets.make_moons(1024, noise=0.1)
    x = torch.tensor(x, dtype=torch.float32)
    optimizer.zero_grad()
    loss = -flow.log_prob(inputs=x).mean()
    loss.backward()
    optimizer.step()