ziqipang / LM4VisualEncoding

[ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"
https://arxiv.org/abs/2310.12973
MIT License

Some questions about ViT-Small-LLaMA #4

Closed — 1090h2400 closed this issue 8 months ago

1090h2400 commented 8 months ago

Hi sir! Thanks for your great work. I would like to know whether the ViT-Small-LLaMA checkpoint includes the LLaMA model.

ziqipang commented 8 months ago

It includes the LLaMA layer we used (the 32nd layer of LLaMA), corresponding to `self.llama` in the state_dict.
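For anyone wanting to verify this, a minimal sketch of how you could inspect a checkpoint's state_dict for the `llama` parameters. The `ToyModel` below is a hypothetical stand-in (not the repo's actual model class) just to show that a submodule assigned to `self.llama` produces keys prefixed with `llama.`:

```python
import torch
import torch.nn as nn

def llama_keys(state_dict):
    """Return the parameter names belonging to the `llama` submodule."""
    return [k for k in state_dict if k.startswith("llama")]

# Hypothetical stand-in mirroring the `self.llama` attribute; the real
# checkpoint stores the frozen 32nd LLaMA transformer block here.
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.patch_embed = nn.Linear(16, 32)
        self.llama = nn.Linear(32, 32)  # stands in for the frozen LLaMA block

model = ToyModel()
print(llama_keys(model.state_dict()))  # → ['llama.weight', 'llama.bias']
```

With a real checkpoint file you would instead load it with `torch.load(path, map_location="cpu")` and pass the resulting dict to `llama_keys`.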