dome272 / VQGAN-pytorch

PyTorch implementation of VQGAN (Taming Transformers for High-Resolution Image Synthesis, https://arxiv.org/pdf/2012.09841.pdf)
MIT License

The issue with training results. #10

Open JunZhan2000 opened 1 year ago

JunZhan2000 commented 1 year ago

Hello, thank you very much for your code and videos! I'm using this code repository to train on the flowers dataset with a batch size of 32 for 200 epochs, but the reconstructed images still only have rough outlines without specific details. Is there something wrong somewhere?

[image: reconstructed samples]
githuboflk commented 10 months ago

@junzhan18 I have the same question. Have you solved it yet?

aa1234241 commented 9 months ago

[image: 42_1000] Replacing the model code with the official repo solved the problem.

githuboflk commented 9 months ago

Can you share a link to the official repo? @aa1234241

aa1234241 commented 9 months ago

@githuboflk https://github.com/CompVis/taming-transformers

githuboflk commented 9 months ago

@aa1234241 Thanks for your reply. Did you just replace the model code and leave the rest unchanged?

aa1234241 commented 9 months ago

[screenshot: Screen Shot 2023-12-28 at 4 05 37 PM]

I replaced all of these components with the official code and it works.
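
For anyone attempting the same swap, here is a minimal sketch of how the official pieces drop in. The import paths are from CompVis/taming-transformers, and the ddconfig values are the f16 settings from its configs; everything else is illustrative rather than @aa1234241's exact code:

```python
# Swap in the reference modules from CompVis/taming-transformers
from taming.modules.diffusionmodules.model import Encoder, Decoder      # VQGAN encoder/decoder
from taming.modules.vqvae.quantize import VectorQuantizer2 as VectorQuantizer
from taming.modules.discriminator.model import NLayerDiscriminator, weights_init
from taming.modules.losses.lpips import LPIPS                           # perceptual loss

# Encoder/Decoder take the ddconfig kwargs from the official yaml configs, e.g. f16:
ddconfig = dict(ch=128, out_ch=3, ch_mult=(1, 1, 2, 2, 4), num_res_blocks=2,
                attn_resolutions=[16], dropout=0.0, resamp_with_conv=True,
                in_channels=3, resolution=256, z_channels=256, double_z=False)
encoder, decoder = Encoder(**ddconfig), Decoder(**ddconfig)

quantizer = VectorQuantizer(n_e=1024, e_dim=256, beta=0.25)              # codebook
discriminator = NLayerDiscriminator(input_nc=3, n_layers=3).apply(weights_init)
perceptual_loss = LPIPS().eval()                                         # frozen, inputs in [-1, 1]
```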

abc2308 commented 9 months ago

@aa1234241 Thanks for the pointer. I ran into some problems while making the changes. Could you make your code public?

cxk0703 commented 9 months ago

@aa1234241 Thanks for the tip. How many epochs did you train for? I replaced the model, trained for 200 epochs on 800 images, and still get only contours.

aa1234241 commented 9 months ago

[image: 145_0] Here is my result after training for 145 epochs on the flower dataset. It's not perfect and I'm still training. You can double-check your code: I replaced the VQGAN model, the discriminator model, and the perceptual loss.

cxk0703 commented 9 months ago

@aa1234241 Your results look very good. I changed the code as you described, but my results are still poor, so I must have made some mistakes. Could you share your code? Thank you very much.

aa1234241 commented 9 months ago

Sorry, I can't upload the code since that would violate company policy. I recommend debugging both the official VQGAN code and your own, and identifying where the outputs diverge. In my case, I directly replaced the VQGAN model, the discriminator model, and the perceptual loss, and I also fixed the visualization issue, while leaving everything else unchanged. You could also disable the GAN loss first, treat the model as a VQ-VAE, and check the results, as in the sketch below.
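
A minimal sketch of that ablation, assuming a training step shaped roughly like this repo's training_vqgan.py (names like `vqgan`, `discriminator`, and `perceptual_loss` are illustrative):

```python
import torch

disc_factor = 0.0  # 0.0 disables the GAN term, i.e. plain VQ-VAE training

# inside the training loop:
decoded_images, _, q_loss = vqgan(imgs)             # reconstruction + codebook loss
rec_loss = torch.abs(imgs - decoded_images)         # L1 reconstruction
p_loss = perceptual_loss(imgs, decoded_images)      # LPIPS, frozen
nll_loss = (p_loss + rec_loss).mean()

g_loss = -torch.mean(discriminator(decoded_images))  # generator (hinge) term

# with disc_factor = 0 this reduces to perceptual + L1 + codebook loss
vq_loss = nll_loss + q_loss + disc_factor * g_loss
```

Once the VQ-VAE reconstructions look right, turn `disc_factor` back on (the training code already delays the GAN term until a warm-up step, so raising that threshold is an alternative) to recover the full VQGAN objective.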

[image: 49_1000] These are the VQ-VAE results after 50 epochs; make sure this stage is visually coherent. After that you can add the GAN loss back, and the results will improve.
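
On "visually coherent": one easy pitfall is the visualization itself, which may or may not be the issue mentioned above. If the dataloader normalizes images to [-1, 1] (this repo's appears to), both the real and reconstructed images need the same rescale back to [0, 1] before saving, or the grid looks washed out even when the model is fine. A minimal sketch:

```python
import torch
from torchvision import utils as vutils

# imgs and decoded_images assumed to be in [-1, 1]
with torch.no_grad():
    grid = torch.cat((imgs[:4], decoded_images[:4]))
    grid = grid.add(1).mul(0.5).clamp(0, 1)  # rescale both halves identically
    vutils.save_image(grid, "results/sample.jpg", nrow=4)
```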

cxk0703 commented 9 months ago

@aa1234241 Thank you very much, I will try that.

aa1234241 commented 9 months ago

Update: here is the result of the VQGAN after 300 epochs of training [image: 309_0], miniGPT after 500 epochs [image: transformer_557], and a sampling result [image: transformer_1].

2022yingjie commented 7 months ago

Hi, did you solve your problem? Does it work after the replacement?

SnakeOnex commented 1 month ago

Has anyone else here managed to get results as good as @aa1234241's? I'm trying the VQ-VAE approach (removing the GAN and replacing the model & LPIPS with code from the original repo). I got these results after 150 epochs: [image: 170_400]

Here is my fork with the changes

aa1234241 commented 1 month ago

Hello everyone, I've made my changes publicly available at https://github.com/aa1234241/vqgan.

aa1234241 commented 1 month ago

That looks like a bug in the LPIPS code.
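
A quick way to rule the LPIPS implementation in or out is to compare it against an independent reference on the same batch, e.g. the `lpips` package from richzhang/PerceptualSimilarity (`pip install lpips`; an assumption here, not what anyone in this thread used):

```python
import torch
import lpips  # pip install lpips (richzhang/PerceptualSimilarity)

# reference implementation; inputs are expected in [-1, 1]
lpips_vgg = lpips.LPIPS(net='vgg').eval()

@torch.no_grad()
def compare(x, x_rec, repo_lpips):
    """Print the reference LPIPS next to this repo's value on the same batch."""
    ref = lpips_vgg(x, x_rec).mean()
    ours = repo_lpips(x, x_rec).mean()
    print(f"reference LPIPS: {ref:.4f}   repo LPIPS: {ours:.4f}")
```

If the two numbers diverge substantially, the bug is in the local LPIPS code (weights, normalization, or input scaling).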