jingyaogong / minimind-v

[LLM] Train a 27M-parameter vision-language model (VLM) from scratch in 3 hours; inference and training both run on a personal GPU!
https://jingyaogong.github.io/minimind-v
Apache License 2.0
382 stars 41 forks

sft-vlm error #13

Open MAOJIASONG opened 1 month ago

MAOJIASONG commented 1 month ago

Epoch:[0/19](100/991) loss:10.074 lr:0.0000010 epoch_Time:13.0min:                                                                                                                   
Epoch:[0/19](200/991) loss:9.163 lr:0.0000010 epoch_Time:12.0min:                                                                                                                    
Epoch:[0/19](300/991) loss:8.855 lr:0.0000010 epoch_Time:10.0min:                                                                                                                    
Epoch:[0/19](400/991) loss:8.470 lr:0.0000010 epoch_Time:9.0min:                                                                                                                     
Epoch:[0/19](500/991) loss:8.563 lr:0.0000010 epoch_Time:7.0min:                                                                                                                     
Epoch:[0/19](600/991) loss:8.411 lr:0.0000010 epoch_Time:5.0min:                                                                                                                     
Epoch:[0/19](700/991) loss:8.361 lr:0.0000010 epoch_Time:4.0min:                                                                                                                     
Epoch:[0/19](800/991) loss:7.748 lr:0.0000010 epoch_Time:2.0min:  

[rank3]:   File "/minimind-v/model/model.py", line 391, in count_vision_proj                                                                                    
[rank3]:     before = h[i, :image_indices[i][0], :]                                       
[rank3]: IndexError: list index out of range

During sft-vlm training, indexing into image_indices goes out of the list bounds partway through the run.

jingyaogong commented 1 month ago

Is your max_seq_len set to 200? Set it to 512.
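
One plausible reading of this suggestion, sketched below under stated assumptions: if a sample is truncated to max_seq_len before the image placeholder tokens are reached, the search for those placeholders finds nothing, and indexing the empty result with [0] raises the IndexError seen in the traceback. The actual logic in model.py is not shown in this thread, so IMAGE_TOKEN_ID, NUM_IMAGE_TOKENS, find_image_indices, build_sample, and first_image_start are hypothetical names used only for illustration, and the lengths are scaled down to keep the example small.

```python
# Hypothetical placeholder id and length, chosen only for illustration.
IMAGE_TOKEN_ID = 34
NUM_IMAGE_TOKENS = 4


def find_image_indices(token_ids, image_token_id=IMAGE_TOKEN_ID,
                       num_image_tokens=NUM_IMAGE_TOKENS):
    """Return a (start, end) pair for each complete run of placeholder tokens."""
    spans = []
    i = 0
    while i + num_image_tokens <= len(token_ids):
        window = token_ids[i:i + num_image_tokens]
        if all(t == image_token_id for t in window):
            spans.append((i, i + num_image_tokens - 1))
            i += num_image_tokens
        else:
            i += 1
    return spans


def build_sample(prompt_len, max_seq_len):
    """Text tokens followed by the image placeholder, truncated to max_seq_len."""
    tokens = [1] * prompt_len + [IMAGE_TOKEN_ID] * NUM_IMAGE_TOKENS
    return tokens[:max_seq_len]


def first_image_start(spans):
    """Guarded lookup: None when truncation removed the placeholder."""
    return spans[0][0] if spans else None


# Limit too small: the placeholder is cut off, so no span is found.
short = find_image_indices(build_sample(prompt_len=10, max_seq_len=8))
# Limit large enough: the placeholder survives truncation.
long = find_image_indices(build_sample(prompt_len=10, max_seq_len=16))

print(short)                     # []
print(long)                      # [(10, 13)]
print(first_image_start(short))  # None
print(first_image_start(long))   # 10
```

In this sketch, `short[0]` would raise `IndexError: list index out of range`, the same error type as in the report. Raising the length limit avoids the truncation; a guard like first_image_start (or filtering out over-long samples in the dataset) would be an additional safeguard, though whether the repository does either is not stated here.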