baaivision / Emu3

Next-Token Prediction is All You Need
Apache License 2.0

Question on the vocabulary size #26

Open PPPPPsanG opened 1 month ago

PPPPPsanG commented 1 month ago

Emu3 is good work, but I have some questions about it. The vocabulary size of Qwen is 152064, while the codebook size of the vision tokenizer employed in Emu3 is 32768. The sum is 184832, but the vocabulary size reported in Emu3 is 184622. Why do the numbers not match?

ryanzhangfan commented 3 weeks ago

We use the vocab.json in Qwen2, which has 151643 tokens, plus 32768 vision tokens, 205 extra tokens, and 6 special tokens, making the total vocabulary size 184622.
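For anyone double-checking the arithmetic, here is a minimal sketch of the breakdown described above. The dictionary keys are illustrative labels, not identifiers from the Emu3 codebase; only the counts come from this thread.

```python
# Token budget breakdown as stated in the reply above.
# Key names are illustrative; only the counts are from the thread.
components = {
    "qwen2_vocab_json_tokens": 151643,  # entries in Qwen2's vocab.json
    "vision_tokens": 32768,             # Emu3 vision tokenizer codebook
    "extra_tokens": 205,
    "special_tokens": 6,
}

total = sum(components.values())
print(total)  # 184622
```

Note that 152064 (the questioner's figure) is Qwen's padded embedding size, not the number of entries in vocab.json, which is where the apparent mismatch comes from.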