FoundationVision / VAR

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
MIT License
4.3k stars 316 forks

[New!] Try ImageFolder🚀 tokenizer for next-scale AR with shorter token length🔥, faster speed🔥 and better FID🔥 #88

Open lxa9867 opened 1 month ago

lxa9867 commented 1 month ago

Project page

We aim to create a reproducible tokenizer codebase for next-scale AR prediction.

iFighting commented 1 month ago

Thanks, I will read this paper carefully. Maybe we can have a discussion about it.