NVlabs / NVAE

The Official PyTorch Implementation of "NVAE: A Deep Hierarchical Variational Autoencoder" (NeurIPS 2020 spotlight paper)
https://arxiv.org/abs/2007.03898

Reconstructed images visualization #16

Open elsaschalck opened 3 years ago

elsaschalck commented 3 years ago

Hello, I'm especially interested in the reconstruction of the images given as input to the model, and I would like to obtain and visualize the reconstructed images. Is the right way to do this to call output_img = output.sample() after logits, log_q, log_p, kl_all, _ = model(x) and output = model.decoder_output(logits) in the train and test functions? Would you recommend taking several samples for the same image? Thank you for the work and the release, Elsa

arash-vahdat commented 3 years ago

Hi Elsa,

Sorry for my slow reply. Yes, for color images, output.sample() is a good way of obtaining images. Are you planning to compute metrics like the FID score on these samples? When we sample from the decoder using the method above, the samples carry a very small amount of noise that is usually not visible to the eye but can badly hurt the FID score.

Regards, -- Arash
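
One way to get a feel for the sampling noise mentioned above is to draw several samples from the same decoder distribution and look at how much they vary per pixel. The sketch below is only an illustration: the helper name sample_spread is hypothetical, and a narrow Normal distribution stands in for the decoder output, which is an assumption and not the distribution NVAE actually returns.

```python
import torch
from torch.distributions import Normal


def sample_spread(output, num_samples=8):
    """Draw several samples from a distribution and measure how much they differ.

    `output` is any object with a .sample() method that returns a tensor.
    """
    # Stack the draws along a new leading dimension: (K, B, C, H, W).
    samples = torch.stack([output.sample() for _ in range(num_samples)])
    per_pixel_std = samples.std(dim=0)  # variation across the K draws
    return samples.mean(dim=0), per_pixel_std


# Stand-in for the decoder output (assumption): a narrow Normal
# centred on a random fake image batch.
torch.manual_seed(0)
fake_mean = torch.rand(2, 3, 8, 8)
output = Normal(fake_mean, 0.01)

mean_img, std_img = sample_spread(output)
print(mean_img.shape, float(std_img.max()))
```

With a real model, the output object returned by model.decoder_output(logits) could be passed in place of the stand-in. A small per-pixel spread would be consistent with noise that is hard to see by eye.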

elsaschalck commented 3 years ago

Hi Arash, Thank you for your answer and for the details. I am not planning to compute metrics like the FID score on the images for the moment. I am focusing on image reconstruction rather than image generation. I would like to compare the output image to the input image during training and testing. My final goal is to study how well the network is able to reconstruct a given input image, for instance with a metric like MSE. Therefore, how can I get the final reconstructed image for a given input? Kind regards, Elsa

arash-vahdat commented 3 years ago

For that purpose too, output.sample() should work. As you have noticed, we record reconstructed images in this line: https://github.com/NVlabs/NVAE/blob/master/train.py#L202
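
As a minimal sketch of the comparison discussed in this thread, the snippet below computes a per-image MSE between an input batch and reconstructions obtained with output.sample(). It uses only the three calls quoted above. StubModel and reconstruction_mse are hypothetical names introduced for illustration: the stub simply echoes its input through a narrow Normal distribution so that the example runs without a trained NVAE checkpoint. It also assumes that x and the sampled image share the same value range, which should be checked against the real model.

```python
import torch
from torch.distributions import Normal


class StubModel(torch.nn.Module):
    """Hypothetical stand-in that mimics only the two calls quoted in this thread."""

    def forward(self, x):
        logits = x  # a real model would encode and decode here
        return logits, None, None, None, None

    def decoder_output(self, logits):
        # Assumption: a narrow Normal, not NVAE's actual output distribution.
        return Normal(logits, 0.01)


@torch.no_grad()
def reconstruction_mse(model, x, num_samples=1):
    """Per-image MSE between x and sampled reconstructions, averaged over draws."""
    logits, log_q, log_p, kl_all, _ = model(x)
    output = model.decoder_output(logits)
    errors = []
    for _ in range(num_samples):
        output_img = output.sample()
        # Average the squared error over all non-batch dimensions.
        errors.append(((output_img - x) ** 2).flatten(start_dim=1).mean(dim=1))
    return torch.stack(errors).mean(dim=0)  # shape: (batch,)


torch.manual_seed(0)
x = torch.rand(4, 3, 8, 8)
mse = reconstruction_mse(StubModel().eval(), x, num_samples=4)
print(mse)
```

Setting num_samples above 1 averages the error over several draws for the same input, which is one possible way to reduce the effect of sampling noise on the metric.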

elsaschalck commented 3 years ago

Great, thank you, I was looking for this specific line. Best regards, Elsa