Open xc-7984 opened 5 years ago
Even when K=2, our dynamic linear transformation differs from the affine coupling layer, as discussed in Section 3.1.
We found that K=4 and K=6 for the inverse dynamic linear transformation are also worse than K=2, so we didn't discuss them in our paper due to space constraints.
You can confirm this with the following tests if you're interested:
python main.py --results_dir results/cifar10_noCond_4parts --num_parts 4 --width 308 --decomposition 1
python main.py --results_dir results/cifar10_noCond_6parts --num_parts 6 --width 256 --decomposition 1
So the best K is 2? When K=2, Glow uses h(x1) = x1, while yours is h(x1) = s1*x1 + u1. Can changing only this make the results better than Glow on the ImageNet dataset? I am confused about that.
Yeah, it turns out our best results are obtained by changing y1 = x1 in the affine coupling layer to y1 = s1*x1 + u1 (similar to an actnorm layer). This is reasonable: in an affine coupling layer, half of the input always remains unchanged, which could act as a bias.
So if I replace the dynamic linear transformation with an affine coupling layer and an actnorm layer, the result should be better? Each step of Glow already consists of an affine coupling layer and an actnorm layer. I still don't understand why your model is better than Glow on the ImageNet dataset.
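To make the K=2 difference concrete, here is a minimal NumPy sketch and not the repository's implementation. It assumes that s2 and u2 are predicted from x1, and that s1 and u1 are plain learned per-dimension parameters; `toy_net` and its weight matrix `w` are hypothetical stand-ins for the real conditioning network.

```python
import numpy as np

def toy_net(x1, w):
    # Hypothetical stand-in for the conditioning network:
    # maps x1 to a positive scale and a shift for the other half.
    h = np.tanh(x1 @ w)
    log_s, u = np.split(h, 2, axis=-1)
    return np.exp(log_s), u

def affine_coupling(x, w):
    # Glow-style coupling: the first half passes through unchanged.
    x1, x2 = np.split(x, 2, axis=-1)
    s2, u2 = toy_net(x1, w)
    y = np.concatenate([x1, s2 * x2 + u2], axis=-1)
    logdet = np.sum(np.log(s2), axis=-1)
    return y, logdet

def dynamic_linear_k2(x, w, s1, u1):
    # K=2 variant described in this thread: the first half is also
    # transformed, y1 = s1 * x1 + u1 (assumed input-independent s1, u1).
    x1, x2 = np.split(x, 2, axis=-1)
    s2, u2 = toy_net(x1, w)  # assumption: conditioned on x1
    y1 = s1 * x1 + u1
    y2 = s2 * x2 + u2
    logdet = np.sum(np.log(np.abs(s1))) + np.sum(np.log(s2), axis=-1)
    return np.concatenate([y1, y2], axis=-1), logdet

def dynamic_linear_k2_inverse(y, w, s1, u1):
    # Undo y1 first, then recompute s2, u2 from the recovered x1.
    y1, y2 = np.split(y, 2, axis=-1)
    x1 = (y1 - u1) / s1
    s2, u2 = toy_net(x1, w)
    x2 = (y2 - u2) / s2
    return np.concatenate([x1, x2], axis=-1)

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))   # batch of 3 inputs, 4 dimensions
w = rng.normal(size=(2, 4))   # toy conditioning weights
s1, u1 = np.array([0.5, 2.0]), np.array([0.1, -0.3])
y, logdet = dynamic_linear_k2(x, w, s1, u1)
x_rec = dynamic_linear_k2_inverse(y, w, s1, u1)
```

Under these assumptions, setting s1 = 1 and u1 = 0 makes `dynamic_linear_k2` coincide with `affine_coupling`, so the only difference in this sketch is the extra scale and shift on the first half.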
Hello, I just wanted to follow up on this.
I feel as if I'm missing something important here. When K=2, is your model exactly the same as Glow, except for the fact that in the affine coupling layer, you have h(x_1) = s_1*x_1 + u_1 instead of h(x_1) = x_1 in Glow?
@lukemelas The changes in our best case (K=2) compared to Glow can be summarized in three points:
I think our other novel contributions are also important:
Thanks for the quick and thorough response!
Your response also helps me.
In Figure 2 of your paper, you show that K=2 is the better choice of K, so is there any difference between your model and Glow when K=2? And when K=4 or 6, what are the results of the inverse dynamic linear transformation?