tsb0601 / MMVP


instruction fine-tuning failed. #6

Open shengyuwoo opened 5 months ago

shengyuwoo commented 5 months ago

Hello, thank you very much for your open source contribution. I have a question. When I ran the second-stage instruction fine-tuning with your project, the program would hang without reporting any errors whenever I used zero3.json or zero2.json as the DeepSpeed config — it simply could not proceed. However, when I used zero3_offload.json, training proceeded normally. Could you please tell me the reason behind this? Have you run this through your open source project yourself?

My training machine has the following hardware configuration: 8x A800 (80 GB). The deepspeed version is 0.12.6 and the transformers version is 4.31.0.
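For context, the practical difference between the two setups is the `offload_optimizer`/`offload_param` entries in the ZeRO section, which move optimizer and parameter state to CPU memory. A minimal zero3_offload-style DeepSpeed config sketch (the field values here are illustrative assumptions, not the repo's actual zero3_offload.json) looks like:

```json
{
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "offload_param": { "device": "cpu", "pin_memory": true },
    "overlap_comm": true,
    "contiguous_gradients": true
  },
  "bf16": { "enabled": "auto" },
  "train_batch_size": "auto",
  "gradient_accumulation_steps": "auto"
}
```

A hang that disappears only when offload is enabled often points at a GPU-memory or collective-communication stall rather than a config syntax error, so comparing the non-offload config against this shape is a reasonable first debugging step.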

tsb0601 commented 5 months ago

Hi, could you please provide more details about the bug?

Z-MU-Z commented 4 months ago

Hello, I also encountered a problem during fine-tuning. When I used llava_v1_5_mix665k.json for fine-tuning, the model would report an error whenever it hit QA samples that have no image. Have you encountered the same problem? How should it be solved?
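One common workaround for mixed datasets like llava_v1_5_mix665k.json, which contains text-only conversations alongside image QA, is to pad text-only samples with a zero image tensor so batch shapes stay uniform. This is a minimal sketch under assumptions (the function name, `crop_size`, and the tensor layout are hypothetical, not the repo's actual API), not the project's confirmed fix:

```python
import torch


def get_image_tensor(sample, crop_size=336):
    """Return the sample's preprocessed image tensor, or a zero placeholder
    for text-only QA samples so the collator can still stack the batch.

    Assumes images are already preprocessed to shape (3, crop_size, crop_size);
    both the key name "image" and the 336px default are assumptions here.
    """
    image = sample.get("image")
    if image is not None:
        return image
    # Text-only sample: substitute an all-zero image so downstream code that
    # expects a (3, H, W) tensor per sample does not crash on a missing key.
    return torch.zeros(3, crop_size, crop_size)
```

Whether the zero tensor should actually be fed through the vision tower (versus masked out of the loss) depends on how the training loop handles multimodal batches, so treat this only as a starting point.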