FoundationVision / LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
https://arxiv.org/abs/2406.06525
MIT License

Questions about the results of your experiment. #38

Open potatowarriors opened 3 months ago

potatowarriors commented 3 months ago

Hi. I have a question about the results of your experiments. In Section 3.1, the codebook size ablation study shows usage increasing from 75% to 97% as the size grows from 8192 to 16384. It is interesting that usage is 100% at size 4096, drops to 75% at 8192, and then rises back to 97% at 16384. Do you have any insight into, or conclusions about, this non-monotonic result?
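
For context, "usage" here is commonly measured as the fraction of codebook entries that get selected at least once when encoding a held-out set of images. A minimal sketch of that metric, assuming a VQ tokenizer whose `encode` method returns token indices (the method name and data loader are illustrative, not the repo's actual API):

```python
import torch

@torch.no_grad()
def codebook_usage(tokenizer, loader, codebook_size=16384, device="cuda"):
    """Estimate codebook usage: the fraction of codebook entries selected
    at least once while encoding a held-out image set.

    Assumes `tokenizer.encode` returns the nearest-codebook-entry indices
    for each image; adapt to the actual VQ model interface.
    """
    used = torch.zeros(codebook_size, dtype=torch.bool, device=device)
    for images, _ in loader:
        indices = tokenizer.encode(images.to(device))  # (B, h*w) token ids
        used[indices.flatten().unique()] = True
    return used.float().mean().item()  # e.g. 0.75 -> 75% usage
```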

PeizeSun commented 3 months ago

Hi~ We are re-running this set of ablation studies to see whether the results are reproducible. We will update this thread soon, please stay tuned.

haha-yuki-haha commented 2 months ago

Hi. I was also wondering about the codebook dimension experiments. I trained the VQGAN with a downsampling factor of 4, but I do not get similar results. Do you see similar results when the downsampling factor is 8 or 4? Thanks.
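
One reason the downsampling factor may matter: a VQGAN with spatial downsampling factor f encodes an H×W image into an (H // f) × (W // f) token grid, so f=4 performs 4× as many codebook lookups per image as f=8, which could shift the usage statistics. A quick sketch of the arithmetic:

```python
# Token grid produced by a VQGAN with spatial downsampling factor f:
# an H x W image maps to an (H // f) x (W // f) grid of discrete tokens.
def num_tokens(height: int = 256, width: int = 256, f: int = 8) -> int:
    return (height // f) * (width // f)

print(num_tokens(f=16))  # 256 tokens per 256x256 image
print(num_tokens(f=8))   # 1024 tokens
print(num_tokens(f=4))   # 4096 tokens -- 4x the lookups of f=8 per image
```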