alibaba / Pai-Megatron-Patch

The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
Apache License 2.0

Questions about the LLaVA adaptation #333

Closed divisionblur closed 1 month ago

divisionblur commented 1 month ago

The LLaVA adaptation does not seem to support saving the two projection matrices.

divisionblur commented 1 month ago

Also, are there any plans to support Qwen2-VL?

divisionblur commented 1 month ago

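When computing the loss for LLaVA, should the last image-feature embedding be paired with the first text embedding, i.e. should the output at the last image position be scored against the first text token?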
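
To make the question concrete, here is a minimal sketch of how the pairing arises under ordinary next-token shifting. It is not taken from Pai-Megatron-Patch: the helper `shifted_targets`, its parameters, and the token ids are all hypothetical. The mask value -100 is assumed because it is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # assumed mask value (PyTorch cross-entropy default ignore_index)

def shifted_targets(num_image_tokens, text_token_ids, mask_text_prefix=0):
    """Return, for each input position, its target token id or IGNORE_INDEX.

    Hypothetical helper for illustration only.
    """
    # Labels aligned with the inputs: image positions have no token id, so they are masked.
    labels = [IGNORE_INDEX] * num_image_tokens + list(text_token_ids)
    # Optionally mask the first few text tokens (for example, prompt tokens).
    for i in range(mask_text_prefix):
        labels[num_image_tokens + i] = IGNORE_INDEX
    # Next-token shift: the output at position t is scored against labels[t + 1].
    return labels[1:] + [IGNORE_INDEX]

# Three image positions followed by three text tokens.
targets = shifted_targets(num_image_tokens=3, text_token_ids=[11, 12, 13])
print(targets)

# Same sequence, but with the first text token masked.
masked = shifted_targets(num_image_tokens=3, text_token_ids=[11, 12, 13], mask_text_prefix=1)
print(masked)
```

In the first call, index 2 (the last image position) gets the first text token as its target, so it contributes to the loss. In the second call that position is ignored. Which of the two behaviours the repo intends is exactly what this comment is asking.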