dvlab-research / LLaMA-VID

Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Apache License 2.0
622 stars 39 forks

code details #87

Closed Nastu-Ho closed 2 months ago

Nastu-Ho commented 2 months ago

https://github.com/dvlab-research/LLaMA-VID/blob/d1074f3662a772d1b3c723416af59314ba593f67/llamavid/model/language_model/llava_llama_vid.py#L46

I am curious whether the lm_head here is correctly initialized with the pre-trained weights.
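One way to check this independently of the LLaMA-VID code: in Hugging Face transformers, a parameter that the model defines but that is absent from the checkpoint keeps its fresh (random) initialization, and `from_pretrained` reports it under `missing_keys`. The sketch below illustrates that key-matching logic with hypothetical key sets (not the actual LLaMA-VID checkpoint); a real diagnostic would compare the model's `state_dict()` keys against the checkpoint's keys, or load with `output_loading_info=True` and inspect the report.

```python
def find_uninitialized(model_keys, checkpoint_keys):
    """Return parameters present in the model but absent from the checkpoint.

    These parameters keep their fresh (random) initialization, mirroring
    how transformers' from_pretrained reports them as `missing_keys`.
    """
    return sorted(set(model_keys) - set(checkpoint_keys))


# Hypothetical key sets for illustration only (not the real checkpoint).
model_keys = ["model.embed_tokens.weight", "lm_head.weight"]
ckpt_keys = ["model.embed_tokens.weight"]  # lm_head absent from checkpoint

print(find_uninitialized(model_keys, ckpt_keys))  # ['lm_head.weight']
```

If `lm_head.weight` shows up as a missing key when loading the released weights, it was not restored from the checkpoint; if it is absent from the report, it was loaded.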