Closed: cteant closed this issue 1 year ago
In models/ant_torch.py, when calling the inference function, we have:
```python
input_prompt = input[:, : self.prompt_length].contiguous()
```
This seems to suggest that one should pad the input tensor with padding_side="right"; otherwise, the code above extracts the wrong prompt embedding. However, changing padding_side to "right" in generation/ant.py does not fix the problem: in that case, the generated attention_mask appears to be wrong.
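To make the failure mode concrete, here is a minimal sketch of why left padding breaks that slice. The token ids and prompt_length are hypothetical values for illustration, not taken from the repository:

```python
import torch

prompt_length = 3
pad_id = 0

# Two sequences of different lengths, each starting with the same
# 3-token prompt [11, 12, 13].
# With padding_side="right", pads go at the end:
right_padded = torch.tensor([
    [11, 12, 13, 21, 22],
    [11, 12, 13, 21, pad_id],
])
# With padding_side="left", pads go at the front:
left_padded = torch.tensor([
    [11, 12, 13, 21, 22],
    [pad_id, 11, 12, 13, 21],
])

# The slice assumes the prompt occupies the first prompt_length positions.
print(right_padded[:, :prompt_length])  # [[11,12,13],[11,12,13]] -> correct
print(left_padded[:, :prompt_length])   # [[11,12,13],[ 0,11,12]] -> pad leaks into the prompt
```

With left padding, the shorter row's slice picks up the pad token and drops part of the prompt, so the prompt embedding is computed from the wrong tokens.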
Hi,
This issue has been resolved in CPM-Ant-Plus (see #148). We have now also fixed it in #370 for CPM-Ant, keeping the released checkpoints untouched. Please check the latest code.
thanks : )
I am adapting CPM-Ant to an NLG task using LoRA, and I think there may be a bug in the code. During tuning, padding_side is set to "right" (e.g., in tune.py), but during generation, padding_side is set to "left" (e.g., in generation/ant.py).
This inconsistency may degrade model performance when running inference with a batch size larger than 1, as sketched below.
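For illustration, here is a hedged sketch of the mismatch. `pad_batch` is a hypothetical helper that mimics the two padding behaviors; it is not the repository's actual padding code:

```python
import torch

def pad_batch(seqs, pad_id=0, padding_side="right"):
    """Pad variable-length id lists into one tensor, on either side."""
    max_len = max(len(s) for s in seqs)
    out = torch.full((len(seqs), max_len), pad_id, dtype=torch.long)
    for i, s in enumerate(seqs):
        if padding_side == "right":
            out[i, : len(s)] = torch.tensor(s)
        else:
            out[i, max_len - len(s):] = torch.tensor(s)
    return out

seqs = [[11, 12, 13, 21], [11, 12, 13]]
print(pad_batch(seqs, padding_side="right"))  # prompt stays at positions 0..2 in every row
print(pad_batch(seqs, padding_side="left"))   # shorter rows shift right, past the pads
```

A model tuned on right-padded batches sees the prompt at fixed leading positions; generating with left-padded batches breaks that alignment for every sequence shorter than the longest one in the batch, which is exactly the case that only shows up with batch size > 1.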