X-PLUG / mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
https://www.modelscope.cn/studios/damo/mPLUG-Owl
MIT License

How is media_offset padded when training mPLUG-Owl3 with batch size > 1? #247

Closed: goodstudent9 closed this issue 1 month ago

goodstudent9 commented 1 month ago

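Hello, I am fine-tuning the model with ms-swift. Because my dataset is large, I need to train with a batch size greater than 1, but I currently cannot get this to work. The root cause is that the variable media_offset is not padded. I tried padding it with 0 and with -100, but both raise an error, specifically in the select_query function. I also tried repeating the last element up to the maximum sequence length in each batch. The code then runs, but the results are very poor, so this is presumably not the padding scheme you originally used. How did you originally pad media_offset for batched training? This is really important to me. Thank you for your reply!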

def select_query(media_offset, num_queries=None):
    # Positions whose second channel is non-negative count as query positions; shape (B, L).
    query_indices = media_offset[:, :, 1] >= 0
    assert query_indices.sum().item() % num_queries == 0, query_indices.sum().item()
    query_indices = query_indices.nonzero()  # rows of (batch index, position)
    ptr = 0
    # This is the original code.
    while ptr < query_indices.shape[0]:
        first_query_index, last_query_index = query_indices[ptr], query_indices[ptr + num_queries - 1]
        # Each group of num_queries positions must be contiguous and lie in a single sample.
        assert (last_query_index[1] - first_query_index[1] + 1).item() == num_queries
        assert last_query_index[0].item() == first_query_index[0].item()
        batch_id = first_query_index[0].item()
        begin_i = first_query_index[1].item()
        end_i = begin_i + num_queries
        yield batch_id, begin_i, end_i

        ptr += num_queries
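
To make the interaction between the pad value and the `>= 0` mask in select_query concrete, here is a minimal, dependency-free sketch. It is not the fix from the ms-swift PR and not the authors' padding scheme. The nested lists stand in for the real tensor, and the helper names (`pad_rows`, `query_positions`), the toy samples, the value -1 for non-query positions, and `NUM_QUERIES = 2` are all assumptions made for illustration.

```python
NUM_QUERIES = 2  # toy value (assumption); the real count comes from the model config


def pad_rows(rows, pad_value):
    """Right-pad every row to the longest row with [pad_value, pad_value] (hypothetical helper)."""
    max_len = max(len(row) for row in rows)
    return [row + [[pad_value, pad_value]] * (max_len - len(row)) for row in rows]


def query_positions(media_offset):
    """Pure-Python stand-in for `(media_offset[:, :, 1] >= 0).nonzero()`."""
    return [
        (batch_id, pos)
        for batch_id, row in enumerate(media_offset)
        for pos, pair in enumerate(row)
        if pair[1] >= 0
    ]


# Toy samples (assumption): non-query positions hold -1, and each sample
# contains one contiguous group of NUM_QUERIES query positions.
sample_a = [[-1, -1], [0, 0], [0, 1], [-1, -1], [-1, -1]]  # length 5
sample_b = [[0, 0], [0, 1], [-1, -1], [-1, -1]]            # length 4

zero_padded = pad_rows([sample_a, sample_b], pad_value=0)
neg_padded = pad_rows([sample_a, sample_b], pad_value=-1)

zero_count = len(query_positions(zero_padded))
neg_count = len(query_positions(neg_padded))

# Same divisibility check as the first assertion in select_query.
print(zero_count, zero_count % NUM_QUERIES == 0)
print(neg_count, neg_count % NUM_QUERIES == 0)
```

In this toy batch, padding with 0 turns the padded slot into an extra query position (5 in total, not divisible by 2), so the first assertion in select_query would fail. A negative pad value leaves the count at 4 and passes that particular check. This covers only the mask in the function shown above; other code paths may still reject negative values, which would be consistent with the error reported for -100.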

LukeForeverYoung commented 1 month ago

We have opened a PR on ms-swift to address this issue: https://github.com/modelscope/ms-swift/pull/2100. Could you please try the latest version of ms-swift and verify whether training works?

goodstudent9 commented 1 month ago

Thank you for your response.

I tried the new class implementation with a batch size of 2, and it worked.

Best wishes
