ziqipang / LM4VisualEncoding

[ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"
https://arxiv.org/abs/2310.12973
MIT License

Some questions about ViT-Small-LLaMA #4

Closed — 1090h2400 closed this issue 8 months ago

1090h2400 commented 8 months ago

Hi sir! Thanks for your great work. I would like to know whether the ViT-Small-LLaMA checkpoint includes the LLaMA model.

ziqipang commented 8 months ago

It includes the LLaMA layer we used (the 32nd layer of LLaMA), corresponding to `self.llama` in the state_dict.
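For anyone wanting to verify this, a minimal sketch of how you could inspect a checkpoint's state_dict for the `llama` parameters. The `ToyModel` below is a hypothetical stand-in (not the repo's actual model class) just to show that a submodule assigned to `self.llama` produces keys prefixed with `llama.`:

```python
import torch
import torch.nn as nn

def llama_keys(state_dict):
    """Return the parameter names belonging to the `llama` submodule."""
    return [k for k in state_dict if k.startswith("llama")]

# Hypothetical stand-in mirroring the `self.llama` attribute; the real
# checkpoint stores the frozen 32nd LLaMA transformer block here.
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.patch_embed = nn.Linear(16, 32)
        self.llama = nn.Linear(32, 32)  # stands in for the frozen LLaMA block

model = ToyModel()
print(llama_keys(model.state_dict()))  # → ['llama.weight', 'llama.bias']
```

With a real checkpoint file you would instead load it with `torch.load(path, map_location="cpu")` and pass the resulting dict to `llama_keys`.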