Open Njasa2k opened 3 months ago
Great, can't wait to try it out!
Will you guys also release the weights? Thanks!
Yes, please wait for a while, thanks.
Will the training code be released?
Do you have any evaluation results for the 256x256 VAR, such as the per-category FID (animals or other categories) on MJHQ-30K?
Besides, do you think the images you generate are darker than usual?
We evaluated on ImageNet-val in the 256x256 setting, and the reconstruction PSNR is around 22 dB, which seems to be better than VQGAN. The brightness of the generated images is caused by the dataset used for training. Fine-tuning on a small subset will alleviate this issue.
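For readers unfamiliar with the metric, here is a minimal sketch of how PSNR is commonly computed for 8-bit images. It is not the authors' evaluation script; the function name `psnr` and the flat-list input format are assumptions made for illustration. A PSNR of about 22 dB corresponds to a root-mean-square error of roughly 20 gray levels on a 0-255 scale.

```python
import math

def psnr(reference, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Hypothetical helper: inputs are flat sequences of pixel values in [0, max_value].
    """
    if len(reference) != len(reconstruction) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# A uniform error of 20 gray levels gives an MSE of 400.
value = psnr([100, 100, 100, 100], [120, 120, 120, 120])
print(round(value, 2))
```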
Yes, thanks, but I mean the VAR, not the VAE result. Do you have the per-category FID scores of the VAR on MJHQ-30K? And thanks again for your reply!
Well, the per-category FID on MJHQ can be found in Fig. 2 of our main paper; for specific values, refer to the table below:
Thanks a lot, I would say it is really impressive to see that it has such a low FID score!!!
Do you guys add QK normalization?
We do not have QK normalization, but probably will in the next version. The visual autoregressive paradigm proposed by VAR is full of potential, and we are currently working on exploring it for more stable and amazing results.
So looking forward to the code and ckpt, haha
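For context, a minimal single-head sketch of one common QK normalization variant: queries and keys are L2-normalized before the dot product, and a scale factor (learnable in practice, a fixed assumption here) replaces the usual 1/sqrt(d) factor. Other variants apply LayerNorm or RMSNorm to the queries and keys instead. This is not this project's code; the function names and the default `scale=10.0` are hypothetical.

```python
import math

def l2_normalize(vec, eps=1e-6):
    """Scale a vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def qk_norm_attention(queries, keys, values, scale=10.0):
    """Single-head attention with L2-normalized queries and keys.

    Because q and k have unit norm, every logit lies in [-scale, scale],
    so large activations cannot blow up the attention logits.
    """
    normed_keys = [l2_normalize(k) for k in keys]
    outputs = []
    for q in queries:
        qn = l2_normalize(q)
        # Cosine similarity between the query and each key, times the scale.
        logits = [scale * sum(a * b for a, b in zip(qn, k)) for k in normed_keys]
        weights = softmax(logits)
        dim = len(values[0])
        # Weighted sum of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)])
    return outputs
```

With this variant, multiplying a query by a large constant leaves the output essentially unchanged, which is the property usually credited with stabilizing training.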
Hi @krennic999, I was wondering if there were any updates on the release timeline? Was it still scheduled for next week, or have there been any changes or delays? Looking forward to hearing back from you!
Hi, we apologize for the inconvenience. After our discussions, we have determined that the current version, due to issues with the VQVAE and other factors, is not stable enough for practical applications. We may release a revised version, including a modified VQVAE and 1024 generation, later this year.
However, we will do our best to answer any questions about this project. Thank you for your interest.
By unstable, do you mean that although the model can achieve a lower FID score, the generated images are not as stable as those from a diffusion model?
I also do my own VAR t2i training. I find that it is somewhat like a lottery.
Hi @krennic999, may I ask what the approximate release date is?
@daiyixiang666 yes, the results are very unstable and there are some issues with generating some details
@Ccioud err... currently we are addressing the VAE issue; I think we can provide a usable solution by the CVPR submission deadline.
What is your opinion about the VAE? I think we could have an in-depth discussion. The VAE reconstruction at the lower scales is really bad.
Do you think the shared codebook in the VAE is important?
@daiyixiang666, hi, you can send an email to xiao_xiao@mail.ustc.edu.cn and we can discuss further
@krennic999 @Ccioud Could you describe the instability of VAR in detail? I've been running experiments lately and I'm interested in it.
Well, around a month; we need to follow the company's open-source process.