FoundationVision / VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
MIT License

Is it possible to get the embedding of the last layer of the network and add a softmax to get a classifier? #21

Closed (roodkcab closed this issue 5 months ago)

keyu-tian commented 5 months ago

hi @roodkcab, maybe you can add a classifier linear layer in VAR by defining `self.head2 = nn.Linear(self.C, classification_classes)` and replacing `self.head(self.head_nm(h.float(), cond_BD).float()).float()` with `self.head2(self.head_nm(h.float(), cond_BD).float().mean(dim=1)).float()` in models/var.py to get your classification logits.
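For anyone reading later, here is a minimal, self-contained sketch of that suggestion. It is not the actual code from models/var.py: `AdaLNStandIn` is a hypothetical stand-in for `head_nm`, `ClassifierHeadSketch` is an illustrative wrapper, and the shapes (`h` as batch x tokens x channels, `cond_BD` as batch x cond-dim) are assumptions. It only shows the pooling step: `.mean(dim=1)` averages over the token dimension so `head2` sees one vector per image, and a softmax over the resulting logits gives class probabilities.

```python
import torch
import torch.nn as nn


class AdaLNStandIn(nn.Module):
    """Hypothetical stand-in for VAR's head_nm: a LayerNorm modulated by a condition vector."""

    def __init__(self, C: int, D: int):
        super().__init__()
        self.ln = nn.LayerNorm(C, elementwise_affine=False)
        self.to_scale_shift = nn.Linear(D, 2 * C)

    def forward(self, x_BLC: torch.Tensor, cond_BD: torch.Tensor) -> torch.Tensor:
        # (B, D) -> (B, 1, 2C) -> two (B, 1, C) tensors that broadcast over tokens
        scale, shift = self.to_scale_shift(cond_BD).unsqueeze(1).chunk(2, dim=-1)
        return self.ln(x_BLC) * (1 + scale) + shift


class ClassifierHeadSketch(nn.Module):
    """Illustrative wrapper: normalize, mean-pool over tokens, then classify."""

    def __init__(self, C: int, D: int, classification_classes: int):
        super().__init__()
        self.head_nm = AdaLNStandIn(C, D)
        # The extra linear layer suggested in the comment above
        self.head2 = nn.Linear(C, classification_classes)

    def forward(self, h: torch.Tensor, cond_BD: torch.Tensor) -> torch.Tensor:
        # mean(dim=1) collapses the token dimension: (B, L, C) -> (B, C)
        pooled_BC = self.head_nm(h.float(), cond_BD).float().mean(dim=1)
        return self.head2(pooled_BC).float()  # classification logits, (B, classes)


torch.manual_seed(0)
B, L, C, D, num_classes = 2, 5, 16, 16, 10  # toy sizes, chosen arbitrarily
model = ClassifierHeadSketch(C, D, num_classes)
h = torch.randn(B, L, C)       # stands in for the last-layer hidden states
cond_BD = torch.randn(B, D)    # stands in for the conditioning vector
logits = model(h, cond_BD)
probs = logits.softmax(dim=-1)  # per-class probabilities
print(logits.shape)
```

If you train with `nn.CrossEntropyLoss`, pass it the raw logits rather than `probs`, since that loss applies log-softmax internally.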

roodkcab commented 5 months ago

great! I'll have a look. really impressive work! thanks again.