YuZhang10 opened 7 months ago

Hi, I noticed you use `padding_side='right'` in training but `'left'` in eval. In my previous experience, `padding_side` is usually set to `'left'` for generation models (as stated in this link). Looking forward to your reply, thanks in advance.

---

@YuZhang10 Hello, the position encoding used by LLaMA is RoPE, a relative position encoding, so it makes no difference whether left or right padding is used during training. During autoregressive generation, however, each newly generated token is appended to the end of the sequence. With right padding it would be appended after the pad tokens, which is incorrect. Therefore, be sure to use left padding during inference.
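To make the point above concrete, here is a minimal sketch in plain Python (no `transformers` dependency, and not code from this repo) showing where a newly generated token lands in a batch under each padding side:

```python
# Minimal sketch of why batched autoregressive generation needs left padding.
# PAD, left_pad, right_pad, and the sample batch are illustrative, not repo code.
PAD = "<pad>"

def left_pad(tokens, length):
    """Pad on the left so the real tokens end at the sequence's last position."""
    return [PAD] * (length - len(tokens)) + tokens

def right_pad(tokens, length):
    """Pad on the right, pushing pad tokens between the text and any appended token."""
    return tokens + [PAD] * (length - len(tokens))

batch = [["I", "like"], ["You", "are", "very", "kind"]]
max_len = max(len(seq) for seq in batch)

# Right padding: the new token is appended AFTER the pads, so it no longer
# directly follows the sentence it should continue.
right = [right_pad(seq, max_len) + ["<new>"] for seq in batch]
print(right[0])  # ['I', 'like', '<pad>', '<pad>', '<new>']

# Left padding: the new token correctly extends the sentence.
left = [left_pad(seq, max_len) + ["<new>"] for seq in batch]
print(left[0])   # ['<pad>', '<pad>', 'I', 'like', '<new>']
```

In practice, with Hugging Face tokenizers this corresponds to setting `tokenizer.padding_side = "left"` before calling `generate` on a batch.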