dvlab-research / LLaMA-VID

Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Apache License 2.0
622 stars 39 forks

code details #87

Closed Nastu-Ho closed 2 months ago

Nastu-Ho commented 2 months ago

https://github.com/dvlab-research/LLaMA-VID/blob/d1074f3662a772d1b3c723416af59314ba593f67/llamavid/model/language_model/llava_llama_vid.py#L46

I am curious whether the lm_head here is correctly initialized with the pre-trained weights.
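One way to check this independently of the LLaMA-VID code: in Hugging Face transformers, a parameter that the model defines but that is absent from the checkpoint keeps its fresh (random) initialization, and `from_pretrained` reports it under `missing_keys`. The sketch below illustrates that key-matching logic with hypothetical key sets (not the actual LLaMA-VID checkpoint); a real diagnostic would compare the model's `state_dict()` keys against the checkpoint's keys, or load with `output_loading_info=True` and inspect the report.

```python
def find_uninitialized(model_keys, checkpoint_keys):
    """Return parameters present in the model but absent from the checkpoint.

    These parameters keep their fresh (random) initialization, mirroring
    how transformers' from_pretrained reports them as `missing_keys`.
    """
    return sorted(set(model_keys) - set(checkpoint_keys))


# Hypothetical key sets for illustration only (not the real checkpoint).
model_keys = ["model.embed_tokens.weight", "lm_head.weight"]
ckpt_keys = ["model.embed_tokens.weight"]  # lm_head absent from checkpoint

print(find_uninitialized(model_keys, ckpt_keys))  # ['lm_head.weight']
```

If `lm_head.weight` shows up as a missing key when loading the released weights, it was not restored from the checkpoint; if it is absent from the report, it was loaded.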