FoundationVision / VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
MIT License
4.03k stars, 302 forks

training code #22

Open Jiawei-Yang opened 5 months ago

Jiawei-Yang commented 5 months ago

Hi there,

Thanks for your amazing work! I noticed that the recent updates added the training code. Will there also be updates covering the training scripts and data preparation? I'm particularly interested in training the VQVAE part.

iFighting commented 5 months ago

@Jiawei-Yang Thanks for your kind attention. We will release the VQVAE training code in a few days.

shliu0 commented 5 months ago

> @Jiawei-Yang Thanks for your kind attention. We will release the VQVAE training code in a few days.

Hello, I am interested in replicating your VQVAE results and would appreciate some additional details to help me understand and possibly extend your work. Could you please share more specifics about the training configuration, e.g. the exact loss function used during training, the weight of each term, the batch size, and the learning rate? Thanks for your good work!

pkulwj1994 commented 5 months ago

I'm also very interested in the training of the VQ tokenizer part and am looking forward to it! BTW, VAR is super cool work!