rmehta1987 closed this 4 years ago
The original implementation includes an actnorm after all but the final conv in the coupling layer (see here for the code used in the intermediate convs of the coupling layer). I didn't run this with and without actnorm inside the coupling layer, but my guess is that, given the depth of the flow and the batch size at larger resolutions, normalizing preactivations everywhere helped stability.
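To make the layout concrete, here is a minimal PyTorch sketch of a coupling-layer network with actnorm after every conv except the last. It is not the code from this repo or from the original implementation: `ActNorm`, `coupling_net`, `width`, and the `use_actnorm` flag are names made up for illustration, and the kernel sizes, the epsilon, and the zero-initialized final conv are assumptions.

```python
import torch
import torch.nn as nn


class ActNorm(nn.Module):
    """Per-channel affine layer with data-dependent initialization (illustrative sketch)."""

    def __init__(self, n_channels):
        super().__init__()
        self.bias = nn.Parameter(torch.zeros(1, n_channels, 1, 1))
        self.log_scale = nn.Parameter(torch.zeros(1, n_channels, 1, 1))
        self.initialized = False

    def forward(self, x):
        if not self.initialized:
            # Initialize from the first batch so each channel of the output
            # starts with roughly zero mean and unit variance.
            with torch.no_grad():
                mean = x.mean(dim=(0, 2, 3), keepdim=True)
                std = x.std(dim=(0, 2, 3), keepdim=True)
                self.bias.copy_(-mean)
                self.log_scale.copy_(-torch.log(std + 1e-6))  # epsilon is an assumption
            self.initialized = True
        return (x + self.bias) * torch.exp(self.log_scale)


def coupling_net(in_channels, width, out_channels, use_actnorm=True):
    """conv -> actnorm -> relu -> conv -> actnorm -> relu -> conv (no actnorm on the last)."""

    def norm():
        # Hypothetical switch for comparing runs with and without actnorm.
        return ActNorm(width) if use_actnorm else nn.Identity()

    # Assumption: the final conv starts at zero, so the coupling begins as an identity map.
    final = nn.Conv2d(width, out_channels, kernel_size=3, padding=1)
    nn.init.zeros_(final.weight)
    nn.init.zeros_(final.bias)

    return nn.Sequential(
        # The conv bias is dropped when actnorm follows, since actnorm has its own bias.
        nn.Conv2d(in_channels, width, kernel_size=3, padding=1, bias=not use_actnorm),
        norm(),
        nn.ReLU(),
        nn.Conv2d(width, width, kernel_size=1, bias=not use_actnorm),
        norm(),
        nn.ReLU(),
        final,
    )


x = torch.randn(8, 4, 16, 16)
net = coupling_net(in_channels=4, width=32, out_channels=8)
out = net(x)  # the first forward pass also triggers the actnorm initialization
```

Building the same network with `use_actnorm=False` gives the variant without normalization, which is one way the with/without comparison could be set up.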
I haven't run a comparison with the original code, but does having an actnorm layer in the coupling layer improve performance?
https://github.com/kamenbliznashki/normalizing_flows/blob/71e630ced23c33ff34db841da3423a6f18bee5bf/glow.py#L196