Oh, I found it! (exps/*.yaml) I have some questions:
1. Does scale mean data augmentation?
2. GaussianVAE2D is not used in image translation (Appendix A)... where did you use it?
@taki0112
Does scale mean data augmentation?
No. It is the scale factor that we apply to all the input images, and it is kept fixed throughout training. For example, if your images are all 1024x1024 and you set the scale to 0.5, then all the images will be resized to 512x512 before being fed to the networks.
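For illustration, the preprocessing amounts to something like this (a rough sketch, not the repo's actual code; the helper name and the use of PIL are just for illustration):

from PIL import Image

def apply_scale(image, scale=0.5):
    # Hypothetical helper: resize by a fixed factor before feeding the nets,
    # e.g. a 1024x1024 image becomes 512x512 when scale=0.5.
    w, h = image.size
    return image.resize((int(w * scale), int(h * scale)), Image.BILINEAR)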
I think GaussianVAE2D is not used in image translation (Appendix A)... where did you use it?
You can use that layer; it will give similar results. To save some memory, I used a reduced implementation called GaussianLayer. It is equivalent to setting the learnable variance parameters in GaussianVAE2D to 1.
I can't find GaussianLayer... do you mean GaussianNoiseLayer? See this code; there is no GaussianLayer. If you mean GaussianLayer == GaussianNoiseLayer, then it just adds noise... right?
That’s correct.
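In other words (a rough sketch consistent with this thread, not the repo's exact code): GaussianVAE2D samples with a learnable variance, while GaussianNoiseLayer fixes that variance to 1 and simply adds standard normal noise:

import torch
import torch.nn as nn

class GaussianNoiseLayer(nn.Module):
    # Sketch: equivalent to GaussianVAE2D with the learnable variance
    # fixed to 1, i.e. the output is the input plus standard normal noise.
    def forward(self, x):
        return x + torch.randn_like(x)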
Um... I have one more question, sorry!
In your paper, Appendix A, you use a 1x1 conv in the Resblock (N512, K1, S1), but in your code you use a 3x3 conv (stride=2) instead (link).
Which should I use to reproduce the results of your paper: 1x1 or 3x3?
Oops, this is a typo in the paper; I will fix it. The code is correct: it should be 3x3 (K3, S1).
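Concretely, the corrected block looks roughly like this (a sketch; the normalization and activation choices below are illustrative assumptions, not necessarily the exact repo code):

import torch.nn as nn

class ResBlockSketch(nn.Module):
    # Residual block with the corrected 3x3, stride-1 convs (K3, S1),
    # rather than the 1x1 conv printed in the paper.
    def __init__(self, ch=512):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, kernel_size=3, stride=1, padding=1),
            nn.InstanceNorm2d(ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, kernel_size=3, stride=1, padding=1),
            nn.InstanceNorm2d(ch),
        )

    def forward(self, x):
        return x + self.body(x)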
goooooooood!
Thank you
Hi, when I look at this code, there are two ResGen generators (COCOResGen and COCOResGen2). Which one did you use?
I use COCOResGen2, but there is not much difference in terms of results.
In COCOResGen, the decoder loop runs n_gen_front_blk-1 times:

for i in range(0, n_gen_front_blk-1):
    decA += [ReLUINSConvTranspose2d(tch, tch//2, kernel_size=3, stride=2, padding=1, output_padding=1)]
    decB += [ReLUINSConvTranspose2d(tch, tch//2, kernel_size=3, stride=2, padding=1, output_padding=1)]
    tch = tch//2

All of your *.yaml files set n_gen_front_blk = 3, so this makes 2 transposed convs (plus one transposed conv with TanH). But in your paper, the number of transposed convs is 3, plus 1 transposed conv with TanH. Is this also a typo?
@taki0112 Thanks for tracing the code; you are right. I am modifying my answer here. There are 3 transposed convolutional layers: 2 are created in the for loop and 1 is created before the TanH. I count the one before the TanH as a transposed convolutional layer even though its stride is 1. The structure should be:

DCONV-(N256,K3,S2), LeakyReLU
DCONV-(N128,K3,S2), LeakyReLU
DCONV-(N3,K1,S1), TanH
I also found that I have a typo for the encoders. The architecture should be CONV-(N64,K7,S1)
# Convolutional back-end
for i in range(0, n_gen_front_blk-1):
    decA += [LeakyReLUConvTranspose2d(tch, tch//2, kernel_size=3, stride=2, padding=1, output_padding=1)]
    decB += [LeakyReLUConvTranspose2d(tch, tch//2, kernel_size=3, stride=2, padding=1, output_padding=1)]
    tch = tch//2
decA += [nn.ConvTranspose2d(tch, input_dim_a, kernel_size=1, stride=1, padding=0)]
decB += [nn.ConvTranspose2d(tch, input_dim_b, kernel_size=1, stride=1, padding=0)]
decA += [nn.Tanh()]
decB += [nn.Tanh()]
In your code: LeakyReLUConvTranspose2d -> LeakyReLUConvTranspose2d -> ConvTranspose2d, Tanh.
However, in your paper:

DCONV-(N256,K3,S2), LeakyReLU
DCONV-(N128,K3,S2), LeakyReLU
DCONV-(N64,K3,S2), LeakyReLU
DCONV-(N3,K1,S1), TanH

These are different.
And one more question: why did you use output_padding? Can you answer that?
For the discriminator, n_layer = 6:
def _make_net(self, ch, input_dim, n_layer):
    model = []
    model += [LeakyReLUConv2d(input_dim, ch, kernel_size=3, stride=2, padding=1)]  # 16
    tch = ch
    for i in range(1, n_layer):
        model += [LeakyReLUConv2d(tch, tch * 2, kernel_size=3, stride=2, padding=1)]  # 8
        tch *= 2
    model += [nn.Conv2d(tch, 1, kernel_size=1, stride=1, padding=0)]  # 1
    return nn.Sequential(*model)
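As a quick sanity check of the sizes (my own illustration, assuming a 3-channel 256x256 input and that LeakyReLUConv2d wraps a plain stride-2 conv): six stride-2 convs downsample by 2**6 = 64, so the final 1x1 conv produces a 4x4 grid of patch scores.

import torch
import torch.nn as nn

x = torch.randn(1, 3, 256, 256)
layers, ch = [nn.Conv2d(3, 64, 3, stride=2, padding=1)], 64
for _ in range(5):  # n_layer = 6 -> 1 + 5 stride-2 convs in total
    layers += [nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1)]
    ch *= 2
layers += [nn.Conv2d(ch, 1, kernel_size=1, stride=1, padding=0)]
print(nn.Sequential(*layers)(x).shape)  # torch.Size([1, 1, 4, 4])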
In your code: LeakyReLUConv2d -> LeakyReLUConv2d x 5 -> Conv2d(kernel=1, stride=1).
However, in your paper: LeakyReLUConv2d -> LeakyReLUConv2d x 4 -> Conv2d(kernel=2, stride=1).
These are different. Is this also a typo?
I'm sorry for asking so many questions. I am very interested in your paper, and I would be grateful if you could understand my situation.
Why did you use output_padding?
To make sure the output image size is correct. I do not know whether TensorFlow and PyTorch handle padding in the same way.
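A small demonstration of the size issue in PyTorch (an illustrative example, not code from the repo): with kernel_size=3, stride=2, padding=1, a transposed conv maps a 64x64 input to 127x127; output_padding=1 bumps it to the expected 128x128.

import torch
import torch.nn as nn

x = torch.randn(1, 256, 64, 64)
no_op = nn.ConvTranspose2d(256, 128, kernel_size=3, stride=2, padding=1)
with_op = nn.ConvTranspose2d(256, 128, kernel_size=3, stride=2, padding=1, output_padding=1)
print(no_op(x).shape)    # torch.Size([1, 128, 127, 127])
print(with_op(x).shape)  # torch.Size([1, 128, 128, 128])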
If there is a difference between the code and the paper, most likely the code is correct.
@taki0112 For the discriminator, please use COCOMsDis. This multi-scale discriminator works better in most cases.
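Roughly, the multi-scale idea (a paraphrase of the general technique, not COCOMsDis itself) is to apply the same kind of patch discriminator to the image at several resolutions:

import torch.nn as nn

class MultiScaleDisSketch(nn.Module):
    def __init__(self, make_net, n_scales=3):
        super().__init__()
        self.nets = nn.ModuleList([make_net() for _ in range(n_scales)])
        self.down = nn.AvgPool2d(3, stride=2, padding=1)  # halves the resolution

    def forward(self, x):
        outs = []
        for net in self.nets:
            outs.append(net(x))  # score the image at the current scale
            x = self.down(x)     # then downsample for the next discriminator
        return outs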
Hi! I am reproducing your code in TensorFlow, but I do not know the current hyperparameter settings (batch size, input size, dropout rate, etc.). Could you tell me which code I can check?