Closed jaminzzz closed 5 months ago
Hello, awesome work!

I'm confused as to why the vocabulary size (12) and the output dimension of the LM head (16) are inconsistent?

Looking forward to your reply!

---

This is because the Mamba/Caduceus config has a parameter named `pad_vocab_size_multiple`, which pads the vocab size up to a multiple of some number. We set this value to 8, so the model embedding / LM head output gets expanded from 12 to 16.
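For illustration, here is a minimal Python sketch of the round-up rule that `pad_vocab_size_multiple` applies; the helper name is hypothetical, only the rounding behavior is taken from the explanation above:

```python
def padded_vocab_size(vocab_size: int, pad_vocab_size_multiple: int) -> int:
    """Round vocab_size up to the nearest multiple of pad_vocab_size_multiple."""
    remainder = vocab_size % pad_vocab_size_multiple
    if remainder != 0:
        vocab_size += pad_vocab_size_multiple - remainder
    return vocab_size

# With the values from this issue: a vocab of 12 padded to a multiple of 8 gives 16.
print(padded_vocab_size(12, 8))  # 16
```

Padding to a multiple like 8 is a common trick to keep the embedding and LM head matrix dimensions aligned for efficient GPU kernels; the extra rows simply correspond to unused token IDs.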