AntixK / PyTorch-VAE

A Collection of Variational Autoencoders (VAE) in PyTorch.
Apache License 2.0

Image size of 128 and 256 #49

Closed: working12 closed this issue 2 years ago

working12 commented 2 years ago

I have checked previous issues about the image size problem. You mentioned that calling model = VAE(<in_channels>, <latent_dim>, hidden_dims=[16, 32, 64, 128, 256, 512]) would increase the supported image size to 128. To make it 256, should we prepend another value, i.e. hidden_dims=[8, 16, 32, 64, 128, 256, 512]?

Also, why does doing this change the image size? I don't understand. I learned about this from issue https://github.com/AntixK/PyTorch-VAE/issues/29

GabrielDornelles commented 2 years ago

In my usage, in order to increase the image size I had to change some linear layers. Since the model is designed for 64x64 inputs, the shapes of linear layers like fc_mu, fc_var, and decoder_input won't match the flattened output of the convolutions, so I scaled them up (by a factor of 4 for each doubling of the image side) to make it run. Then, of course, for it to actually work well you also have to increase the latent space. For example, to handle 128x128:

# in model __init__
self.fc_mu = nn.Linear(hidden_dims[-1] * 4 * 4, latent_dim)       # was hidden_dims[-1] * 4 for 64x64
self.fc_var = nn.Linear(hidden_dims[-1] * 4 * 4, latent_dim)      # was hidden_dims[-1] * 4 for 64x64
self.decoder_input = nn.Linear(latent_dim, hidden_dims[-1] * 16)  # was hidden_dims[-1] * 4

# in decode method
result = result.view(-1, 512, 4, 4)  # was (-1, 512, 2, 2) for 64x64
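To sanity-check those numbers, here is a minimal sketch of the encoder's conv stack (assuming the default five stride-2 convs with kernel 3 and padding 1, which is my reading of the vanilla VAE in this repo; batch norm is omitted since it does not affect shapes):

import torch
import torch.nn as nn

# Default hidden dims: five stride-2 blocks, each halving spatial size.
hidden_dims = [32, 64, 128, 256, 512]
layers, in_ch = [], 3
for h in hidden_dims:
    layers += [nn.Conv2d(in_ch, h, kernel_size=3, stride=2, padding=1),
               nn.LeakyReLU()]
    in_ch = h
encoder = nn.Sequential(*layers)

x = torch.randn(1, 3, 128, 128)
print(encoder(x).shape)  # torch.Size([1, 512, 4, 4]) -> flatten to 512 * 4 * 4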

If you want 256, just multiply them by 4 again. The reason this changes the image size is that it puts the linear layers in the correct shape to accept the flattened output of the convolution layers. Each stride-2 convolution halves the spatial size, so a 128x128 input leaves the default five convs as a 4x4 feature map instead of 2x2. You can either do it the way you said, adding hidden dims so the larger image is downsampled to the same 2x2 bottleneck, or change the linear layer sizes directly, like I did.
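A small helper (hypothetical, not part of the repo) that captures the arithmetic behind both options:

# Flattened size feeding fc_mu / fc_var: each stride-2 conv halves the
# spatial side, so side = image_size / 2**len(hidden_dims).
def flatten_size(image_size, hidden_dims):
    side = image_size // (2 ** len(hidden_dims))
    return hidden_dims[-1] * side * side

# Option 1: keep five hidden dims and scale the linear layers.
print(flatten_size(128, [32, 64, 128, 256, 512]))         # 512 * 4 * 4 = 8192
# Option 2: add hidden dims so a larger image reaches the same 2x2 bottleneck.
print(flatten_size(128, [16, 32, 64, 128, 256, 512]))     # 512 * 2 * 2 = 2048
print(flatten_size(256, [8, 16, 32, 64, 128, 256, 512]))  # 512 * 2 * 2 = 2048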