Open LutingWang opened 4 months ago
Hi~ I don’t understand what you mean by reproducing the result on 224x224. The expected FID is for 256x256.
Sorry for the mistake. I was trying to emphasize that the image resolution is not 384x384, but I mistakenly wrote 224.
Hi. Thank you for this awesome repo. I have the same issue with the original code: the loss ends around 7.3 after 300 epochs.
Hi, thanks for the excellent work. I'm trying to reproduce the results on 256x256 images. The VQGAN model was reproduced successfully, achieving $2.10$ rFID. However, the AR part shows a significant performance gap. More specifically, I use 8 A100-80G GPUs to run the following scripts
The training results are as follows
Is the final loss reasonable? Do you have any idea what the reason might be?
Thanks!
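One rough way to put a loss like 7.3 on a scale is to compare it with the cross-entropy of a uniform guess over the codebook, which is ln(V) for V tokens. The sketch below assumes the reported number is the mean per-token cross-entropy in nats; the codebook sizes in the loop are hypothetical placeholders, not values taken from this thread, so substitute the tokenizer's real size.

```python
import math


def uniform_baseline_nats(vocab_size: int) -> float:
    """Cross-entropy (in nats) of a uniform guess over vocab_size tokens."""
    return math.log(vocab_size)


def perplexity(loss_nats: float) -> float:
    """Perplexity implied by a mean per-token cross-entropy given in nats."""
    return math.exp(loss_nats)


if __name__ == "__main__":
    reported_loss = 7.3  # value mentioned in this thread
    # Hypothetical codebook sizes, for illustration only.
    for vocab_size in (1024, 4096, 16384):
        baseline = uniform_baseline_nats(vocab_size)
        side = "below" if reported_loss < baseline else "above"
        print(
            f"V={vocab_size}: uniform baseline {baseline:.2f} nats, "
            f"reported loss is {side} it "
            f"(perplexity {perplexity(reported_loss):.0f})"
        )
```

This only shows how far the loss is from random guessing. It does not say whether the final FID should match the paper, since that also depends on the sampling settings.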