Closed: fei1998 closed this issue 2 years ago
Hi @fei1998,
Have you trained P-VQVAE on your dataset? Something is wrong with the reconstruction of the input data. Could you provide your P-VQVAE training logs?
Thanks a lot! The training logs of P-VQVAE on my dataset are shown in the following figures. Do they look wrong? Best wishes!
Hi @fei1998,
The curves are somewhat different from mine. In my experience, the reference image should be mostly masked, because the reference branch in the decoder is very easy to learn, whereas the codebooks, the encoder, and the remaining part of the decoder are much harder to learn. So my suggestion is to increase the mask ratio of the training data and retrain P-VQVAE.
By the way, the reference branch is in fact only there to improve the overall quality of the model. You can even remove it. If you do so, the output image is reconstructed entirely from the encoded quantized tokens. This is very helpful for checking whether the current P-VQVAE configuration is suitable for your dataset.
Best wishes.
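To make the mask-ratio suggestion above concrete, here is a minimal sketch of a mask generator that guarantees at least a given fraction of pixels is masked. It is not the PUT data pipeline: the function names `random_mask` and `mask_ratio`, the rectangle-based strategy, and the convention that `True` means "masked" are all assumptions chosen for illustration. Check the repository's own mask convention and config before reusing the idea.

```python
import math
import random


def random_mask(height, width, min_ratio, rng=None, max_box=0.5):
    """Return a height x width grid of booleans.

    Assumed convention (hypothetical, not taken from PUT):
    True marks a masked pixel, False a visible one.
    """
    if not 0.0 <= min_ratio <= 1.0:
        raise ValueError("min_ratio must be in [0, 1]")
    rng = rng or random.Random()
    mask = [[False] * width for _ in range(height)]
    # Number of pixels that must be masked to reach the requested ratio.
    target = min(height * width, math.ceil(min_ratio * height * width))
    masked = 0
    while masked < target:
        # Draw a random rectangle no larger than max_box of each side.
        box_h = rng.randint(1, max(1, int(max_box * height)))
        box_w = rng.randint(1, max(1, int(max_box * width)))
        top = rng.randint(0, height - box_h)
        left = rng.randint(0, width - box_w)
        for y in range(top, top + box_h):
            for x in range(left, left + box_w):
                if not mask[y][x]:
                    mask[y][x] = True
                    masked += 1
    return mask


def mask_ratio(mask):
    """Fraction of pixels marked as masked."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


if __name__ == "__main__":
    m = random_mask(64, 64, min_ratio=0.7, rng=random.Random(0))
    print(f"masked fraction: {mask_ratio(m):.3f}")
```

Raising `min_ratio` leaves less of the reference image visible, which, following the reasoning above, should push the model to rely on the encoder and codebooks rather than on the reference branch.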
Hi, thanks a lot! Let me try again. Could you provide the P-VQVAE model pretrained on ImageNet? Maybe I can fine-tune it on my dataset. Best wishes.
Hi @fei1998, the pretrained models are all provided, since P-VQVAE is a sub-module of PUT. You can get it from our provided model with `model.content_codec`.
Thanks for sharing your great work! During training, I ran into the issues shown in the following figures. The loss remains 0 and the reconstructed image is wrong. How can I solve this? Thanks! Best wishes!