FoundationVision / VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
MIT License
3.78k stars, 285 forks

From another angle, could the multi-scale (ms) codebook also be seen as equivalent to another form of latent diffusion? #72

Open YilanWang opened 3 weeks ago

YilanWang commented 3 weeks ago

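Thanks for the great paper. I was wondering: taking 256x256 as an example, the codebook sizes run over [1, 2, ..., 16], and when computing the residual each scale is also resized to 16. Could this resized information also be regarded as equivalent to latent diffusion? Starting from the coarsest information at size 1 and going up to 16, isn't this a coarse-to-fine process similar to diffusion?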
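To make the process in question concrete, here is a minimal pure-Python sketch of a multi-scale residual decomposition on a single-channel 16x16 map. It is not the VAR implementation. The scale list `(1, 2, 4, 8, 16)`, the average-pool and nearest-neighbour resizing, and the uniform rounding used in place of a learned codebook lookup are all assumptions chosen to keep the example self-contained, and every function name is hypothetical.

```python
import random


def downsample_mean(x, s):
    """Average-pool an n x n grid down to s x s (assumes s divides n)."""
    n = len(x)
    b = n // s
    return [
        [
            sum(x[i * b + di][j * b + dj] for di in range(b) for dj in range(b)) / (b * b)
            for j in range(s)
        ]
        for i in range(s)
    ]


def upsample_nearest(x, n):
    """Nearest-neighbour resize of an s x s grid back up to n x n."""
    b = n // len(x)
    return [[x[i // b][j // b] for j in range(n)] for i in range(n)]


def quantize(x, step):
    """Stand-in for a codebook lookup (assumption): round to a multiple of step."""
    return [[round(v / step) * step for v in row] for row in x]


def multiscale_residual(f, scales, step):
    """Approximate f coarse-to-fine; each scale encodes what is still missing."""
    n = len(f)
    residual = [row[:] for row in f]
    recon = [[0.0] * n for _ in range(n)]
    errors = []
    for s in scales:
        coarse = quantize(downsample_mean(residual, s), step)  # s x s "token map"
        up = upsample_nearest(coarse, n)                        # resized to full size
        for i in range(n):
            for j in range(n):
                recon[i][j] += up[i][j]
                residual[i][j] -= up[i][j]
        # L2 norm of the residual left after this scale
        errors.append(sum(v * v for row in residual for v in row) ** 0.5)
    return recon, errors


random.seed(0)
f = [[random.gauss(0.0, 1.0) for _ in range(16)] for _ in range(16)]
recon, errors = multiscale_residual(f, scales=(1, 2, 4, 8, 16), step=0.25)
for s, e in zip((1, 2, 4, 8, 16), errors):
    print(f"scale {s:2d}: residual norm {e:.4f}")
```

In this sketch each scale is computed deterministically from the remaining residual at full resolution, so the residual norm never grows from one scale to the next. Whether that makes the procedure equivalent to latent diffusion is the open question of this issue, and the sketch does not settle it.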