FoundationVision / VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
MIT License

question about ema_vocab_hit_SV #67

Open shliu0 opened 1 month ago

shliu0 commented 1 month ago

In quant.py, I saw that you use ema_vocab_hit_SV rather than vocab_hit_V to record the codebook hit frequency. What is the reason for this?

keyu-tian commented 1 month ago

Using an EMA (exponential moving average) makes this statistic more stable.
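
To illustrate the point about stability, here is a minimal sketch in plain Python, not the code from quant.py. It compares raw per-batch hit counts for one codebook entry against an EMA of those counts. The helper names (`hit_counts`, `ema_update`), the decay of 0.9, and the vocabulary and batch sizes are all assumptions chosen for illustration. It also tracks a single scale only, whereas the `_SV` suffix presumably denotes one row of counts per scale.

```python
import random
import statistics

def hit_counts(indices, vocab_size):
    """Count how many times each codebook entry was selected in one batch."""
    counts = [0.0] * vocab_size
    for i in indices:
        counts[i] += 1.0
    return counts

def ema_update(ema, batch_counts, decay):
    """Blend the running estimate with the newest batch counts."""
    return [decay * e + (1.0 - decay) * b for e, b in zip(ema, batch_counts)]

random.seed(0)
VOCAB_SIZE = 8      # hypothetical codebook size
BATCH_TOKENS = 64   # hypothetical number of quantized tokens per batch
DECAY = 0.9         # hypothetical EMA decay, not taken from the repository

ema = None
raw_history, ema_history = [], []
for step in range(200):
    # Stand-in for the nearest-code indices a quantizer would produce.
    indices = [random.randrange(VOCAB_SIZE) for _ in range(BATCH_TOKENS)]
    hits = hit_counts(indices, VOCAB_SIZE)
    # Initialize from the first batch, then update with the EMA rule.
    ema = hits if ema is None else ema_update(ema, hits, DECAY)
    raw_history.append(hits[0])
    ema_history.append(ema[0])

# Compare fluctuation of entry 0 over the second half of the run.
raw_var = statistics.pvariance(raw_history[100:])
ema_var = statistics.pvariance(ema_history[100:])
print(f"raw variance: {raw_var:.3f}, EMA variance: {ema_var:.3f}")
```

The raw count of a single batch fluctuates with whatever tokens happen to be in that batch, while the EMA averages over many recent batches, so it gives a steadier picture of which codebook entries are used and which are rarely hit.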