Closed: cteant closed this issue 1 year ago
In models/ant_torch.py, when calling the inference function, we have:
```python
input_prompt = input[:, : self.prompt_length].contiguous()
```
This seems to suggest that one should pad the input tensor with padding_side="right"; otherwise, the code above extracts the wrong prompt embedding. However, changing padding_side to "right" in generation/ant.py does not fix the problem: in that case, the generated attention_mask appears to be wrong.
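To make the failure mode concrete, here is a minimal sketch of why left padding breaks that slice. The token ids and prompt_length are hypothetical values for illustration, not taken from the repository:

```python
import torch

prompt_length = 3
pad_id = 0

# Two sequences of different lengths, each starting with the same
# 3-token prompt [11, 12, 13].
# With padding_side="right", pads go at the end:
right_padded = torch.tensor([
    [11, 12, 13, 21, 22],
    [11, 12, 13, 21, pad_id],
])
# With padding_side="left", pads go at the front:
left_padded = torch.tensor([
    [11, 12, 13, 21, 22],
    [pad_id, 11, 12, 13, 21],
])

# The slice assumes the prompt occupies the first prompt_length positions.
print(right_padded[:, :prompt_length])  # [[11,12,13],[11,12,13]] -> correct
print(left_padded[:, :prompt_length])   # [[11,12,13],[ 0,11,12]] -> pad leaks into the prompt
```

With left padding, the shorter row's slice picks up the pad token and drops part of the prompt, so the prompt embedding is computed from the wrong tokens.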
Hi,
This issue has been resolved in CPM-Ant-Plus (see #148). We have now also fixed it in #370 for CPM-Ant, keeping the released checkpoints untouched. Please check the latest code.
thanks : )
I am adapting CPM-Ant to an NLG task using LoRA, and I think there may be a bug in the code. During tuning, padding_side is set to "right" (e.g., in tune.py), but during generation, padding_side is set to "left" (e.g., in generation/ant.py).
This inconsistency may degrade model performance when running inference with a batch size larger than 1, as sketched below.
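For illustration, here is a hedged sketch of the mismatch. `pad_batch` is a hypothetical helper that mimics the two padding behaviors; it is not the repository's actual padding code:

```python
import torch

def pad_batch(seqs, pad_id=0, padding_side="right"):
    """Pad variable-length id lists into one tensor, on either side."""
    max_len = max(len(s) for s in seqs)
    out = torch.full((len(seqs), max_len), pad_id, dtype=torch.long)
    for i, s in enumerate(seqs):
        if padding_side == "right":
            out[i, : len(s)] = torch.tensor(s)
        else:
            out[i, max_len - len(s):] = torch.tensor(s)
    return out

seqs = [[11, 12, 13, 21], [11, 12, 13]]
print(pad_batch(seqs, padding_side="right"))  # prompt stays at positions 0..2 in every row
print(pad_batch(seqs, padding_side="left"))   # shorter rows shift right, past the pads
```

A model tuned on right-padded batches sees the prompt at fixed leading positions; generating with left-padded batches breaks that alignment for every sequence shorter than the longest one in the batch, which is exactly the case that only shows up with batch size > 1.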