FoundationVision / VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
MIT License

Question about the shared codebook #34

Closed by jiho314 5 months ago

jiho314 commented 5 months ago

Hi, thanks for your wonderful work! I'm studying it with great interest!

Is there any reason to use a shared codebook for multiple scales? Intuitively, it seems there might be a performance gain from separate codebooks (distinguishing the role of each codebook). I wonder if you've tried an unshared codebook.

Again, thank you!

keyu-tian commented 5 months ago

Thanks @jiho314. Although we didn't try this, I also believe that unshared codebooks could yield better results, because sharing the codebook would limit the model's expressive power. We use the shared codebook just for convenience.

jiho314 commented 5 months ago

Oh I got it. Thanks for your kind reply!
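
For readers following this thread, the difference between the two options can be sketched in a few lines. This is a toy illustration only, not code from the VAR repository: the class `MultiScaleQuantizer`, its methods, and the sizes used (10 scales, 16 codes, 4 dimensions) are all hypothetical, and it uses a plain nearest-neighbour lookup rather than the quantizer the project actually implements. It shows that a shared codebook reuses one table of code vectors at every scale, while unshared codebooks give each scale its own table, at the cost of proportionally more parameters.

```python
import random


def make_codebook(vocab_size, dim, rng):
    """Create a toy codebook: vocab_size random vectors of length dim."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def nearest_index(vec, codebook):
    """Return the index of the code vector closest to vec (squared L2 distance)."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(vec, codebook[i]))
    return min(range(len(codebook)), key=sq_dist)


class MultiScaleQuantizer:
    """Hypothetical multi-scale quantizer with shared or per-scale codebooks."""

    def __init__(self, num_scales, vocab_size, dim, shared=True, seed=0):
        rng = random.Random(seed)
        if shared:
            # One table, referenced by every scale.
            book = make_codebook(vocab_size, dim, rng)
            self.codebooks = [book] * num_scales
        else:
            # An independent table per scale.
            self.codebooks = [
                make_codebook(vocab_size, dim, rng) for _ in range(num_scales)
            ]

    def quantize(self, scale, vec):
        """Map a feature vector at the given scale to a token index."""
        return nearest_index(vec, self.codebooks[scale])

    def num_parameters(self):
        """Count scalar entries, counting each distinct table only once."""
        unique = {id(cb): cb for cb in self.codebooks}.values()
        return sum(len(cb) * len(cb[0]) for cb in unique)


shared = MultiScaleQuantizer(num_scales=10, vocab_size=16, dim=4, shared=True)
unshared = MultiScaleQuantizer(num_scales=10, vocab_size=16, dim=4, shared=False)

print(shared.num_parameters())    # 64: one 16 x 4 table
print(unshared.num_parameters())  # 640: ten 16 x 4 tables
```

In the shared case a token index means the same code vector at every scale; in the unshared case the same index can map to a different vector at each scale, which is the extra expressive power mentioned above. Whether that helps in practice is untested in this thread.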