Open juntang-zhuang opened 5 years ago
You're right, bijection means the dimension is unchanged. It's the squeezing and splitting operations in the multi-scale architecture that change the dimension of x. But you misunderstood the channel numbers in Table 1 (512, 128, etc.). Those numbers describe the channels of the first two layers of g(), which is a stack of three convolutional layers that learns (log s_k, mu_k) from the input x_{k-1}: (log s_k, mu_k) = g(x_{k-1}). The final outputs log s_k and mu_k have the same dimensions as x_k, provided the last layer of g() is given the right number of channels.
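In case it helps later readers, here is a minimal PyTorch sketch of how I read this answer (purely illustrative; the class and argument names are mine, not the repo's, and the actual layer configuration may differ). The point is that the hidden width (512 or 128 in Table 1) only sets the first two layers, while the last layer is sized so that log s_k and mu_k each match the shape of x_k:

```python
import torch
import torch.nn as nn

class CouplingNet(nn.Module):
    """Hypothetical sketch of g(): three conv layers mapping x_{k-1}
    to (log s_k, mu_k). hidden_channels is the 512/128 from Table 1;
    the final layer outputs 2 * out_channels so that log s_k and mu_k
    each have the same shape as x_k."""

    def __init__(self, in_channels, hidden_channels=512, out_channels=None):
        super().__init__()
        out_channels = out_channels or in_channels
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, hidden_channels, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden_channels, hidden_channels, 1),
            nn.ReLU(),
            # Last layer: width chosen to match x_k, not the table's 512/128.
            nn.Conv2d(hidden_channels, 2 * out_channels, 3, padding=1),
        )

    def forward(self, x):
        log_s, mu = self.net(x).chunk(2, dim=1)  # split along channel axis
        return log_s, mu
```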
Got it, thanks for the clarification. By the way, do you have any plans to release a PyTorch version? It would be great to have this in PyTorch. Thanks again for your reply.
We have no plans for a PyTorch version at the moment, as I'm busy with DLF v2.0, which aims to improve the multi-scale architecture. We may release DLF v2.0 together with a PyTorch version.
Thanks for your nice paper and implementation. I read the paper and have a question about the channel numbers; I'd appreciate it if you could help.
If I understand correctly, the transformations in a normalizing flow should be bijective; only then does the change-of-variables formula apply. A bijective transform cannot change the dimension, so the total number of elements in a tensor must stay fixed (except where the split operation drops half of the tensor on purpose). The squeeze operation respects this: it halves each spatial dimension while increasing the channel count by 4x, so the total number of elements is unchanged. But then how can you choose 512 (or 128) channels in your experiments? Starting from a 3-channel image, the channel number at level L should be 3 * 4^(L-1), and 512 is not divisible by 3. I may have misunderstood your model; please help me clarify. Thanks a lot.
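To make the element-count argument concrete, here is a small, self-contained sketch of the standard squeeze (space-to-depth) operation as I understand it (PyTorch, illustrative only, not the repo's code): each squeeze halves H and W and multiplies the channels by 4, so from a 3-channel image the channels at level L come out to 3 * 4^(L-1).

```python
import torch

def squeeze(x):
    """Space-to-depth: halve H and W, multiply channels by 4.
    The element count is unchanged, so the map can stay bijective."""
    b, c, h, w = x.shape
    x = x.view(b, c, h // 2, 2, w // 2, 2)
    x = x.permute(0, 1, 3, 5, 2, 4).contiguous()
    return x.view(b, c * 4, h // 2, w // 2)

x = torch.randn(1, 3, 32, 32)  # a 3-channel image at level 1
for level in range(1, 4):
    # channels follow 3 * 4**(level - 1): 3, 12, 48, ...
    # numel() stays constant (3072) at every level
    print(level, tuple(x.shape), x.numel())
    x = squeeze(x)
```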