FoundationVision / LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
https://arxiv.org/abs/2406.06525

Embedding layer #66

Closed 1 month ago

wangyf8848 commented 1 month ago

May I ask why the VQVAE's learned codebook is not used as the embedding layer for the autoregressive training of the language model? Only the codebook indices are used, and a new learnable embedding layer is created for the GPT, as below:

```python
self.tok_embeddings = nn.Embedding(config.vocab_size, config.dim)
```
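For context, here is a minimal sketch contrasting the two alternatives the question describes: a freshly learned embedding table over codebook indices (as in the quoted line) versus reusing the VQVAE codebook directly. The sizes (`vocab_size`, `gpt_dim`, `codebook_dim`) and the projection layer are illustrative assumptions, not LlamaGen's actual configuration:

```python
import torch
import torch.nn as nn

vocab_size, gpt_dim = 16384, 768   # illustrative sizes, not the repo's config
codebook_dim = 8                   # assumed VQVAE codebook dimension

# Option A (the quoted line): a fresh embedding table, learned from
# scratch during GPT training, indexed by the VQ token ids.
tok_embeddings = nn.Embedding(vocab_size, gpt_dim)

# Option B (what the question proposes): reuse the frozen VQVAE codebook
# as the embedding table. Its dimension typically differs from the
# transformer width, so a learnable projection would still be needed.
vq_codebook = torch.randn(vocab_size, codebook_dim)  # stand-in for the real codebook weights
codebook_embed = nn.Embedding.from_pretrained(vq_codebook, freeze=True)
proj = nn.Linear(codebook_dim, gpt_dim)

# Both paths map a batch of token indices to transformer inputs.
indices = torch.randint(0, vocab_size, (1, 256))  # token ids from the VQ encoder
x_a = tok_embeddings(indices)        # (1, 256, gpt_dim)
x_b = proj(codebook_embed(indices))  # (1, 256, gpt_dim)
```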