X-PLUG / mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
https://www.modelscope.cn/studios/damo/mPLUG-Owl
MIT License

stage-1 training #113

Closed fanbooo closed 1 year ago

fanbooo commented 1 year ago

Thanks for your work! Could you share the pretraining script? Also, could you briefly explain this part of the loss computation? The paper doesn't seem to describe it in much detail:

    _, loss_mask, position_ids = get_ltor_masks_and_position_ids_from_embeddings(input_embeds)

    # Build the loss mask: combine masks so the loss is computed only on response tokens
    non_padding_mask = non_padding_mask.long()  # 1 where the token is not padding
    non_media_mask = non_media_mask.long()      # 0 at visual/media token positions
    prompt_mask = prompt_mask.long()            # 0 at prompt positions  # TODO How to deal with prompt mask
    # from icecream import ic
    # non_padding_mask = non_padding_mask[:,:-1]
    # non_media_mask = non_media_mask[:,:-1]
    # prompt_mask = prompt_mask[:,:-1]
    # attention_mask = attention_mask[:,:-1]
    loss_mask = loss_mask[:, :-1]

    # Positions where any mask is 0 get label -100, which the LM loss ignores
    loss_mask = loss_mask * non_padding_mask * non_media_mask * prompt_mask
    labels[:, 1:][loss_mask != 1] = -100
    # Forward into GPT
    outputs = self.language_model(
        inputs_embeds=input_embeds,
        attention_mask=attention_mask,
        labels=labels,
        return_dict=return_dict,
        output_attentions=self.config.output_attentions,
    )
    # outputs.loss = (outputs.loss * loss_mask.view(-1)
    #                 ).sum()/loss_mask.sum()
MAGAer13 commented 1 year ago

We only apply the LM (language modeling) loss in both stage 1 and stage 2. It simply computes the loss on the response part.
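
For reference, here is a minimal, self-contained sketch of what that masking achieves: labels at prompt, media, and padding positions are set to -100 so the causal LM loss is computed only on response tokens. The tensor values, mask layout, and vocabulary size below are toy assumptions for illustration, not values taken from the mPLUG-Owl code.

    import torch
    import torch.nn.functional as F

    # Toy sequence layout: [prompt | media | response | padding], batch size 1, seq len 8
    labels = torch.tensor([[11, 12, 13, 14, 15, 16, 0, 0]])  # token ids, 0 = pad (toy values)
    logits = torch.randn(1, 8, 32)                           # (batch, seq_len, vocab=32)

    non_padding_mask = torch.tensor([[1, 1, 1, 1, 1, 1, 0, 0]])  # 1 where token is not padding
    non_media_mask   = torch.tensor([[1, 1, 0, 0, 1, 1, 1, 1]])  # 0 at visual-token positions
    prompt_mask      = torch.tensor([[0, 0, 1, 1, 1, 1, 1, 1]])  # 0 at prompt positions

    # Keep a label only where all three masks are 1 (i.e. response tokens);
    # everything else becomes -100, which cross_entropy ignores.
    loss_mask = non_padding_mask * non_media_mask * prompt_mask
    masked_labels = labels.clone()
    masked_labels[loss_mask != 1] = -100

    # Standard causal-LM shift: predict token t+1 from position t
    shift_logits = logits[:, :-1, :]
    shift_labels = masked_labels[:, 1:]
    loss = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        ignore_index=-100,
    )
    print(loss)  # averaged only over the unmasked (response) positions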