Hi! Really appreciate your great work.
I'm a bit confused about the padding_direction set in LLaMA3's tokenizer.json file. As the comments say, this is used in the model's repack function. Since LLaMA3 is an autoregressive model, why did you choose to pad the embeddings and placeholder labels on the right instead of the left?

Also, right padding raises an issue: the end of the input prompt is difficult to identify during inference. If I want to finetune the model on my own dataset, will it still work if I change the padding side from right to left? Thanks!
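For context, here is a minimal sketch of the difference I mean, in plain Python with made-up token ids and a hypothetical pad id (no real tokenizer involved):

```python
PAD_ID = 0  # hypothetical pad token id, for illustration only

def pad_batch(seqs, side="right", pad_id=PAD_ID):
    """Pad variable-length id sequences to equal length on the given side,
    returning the padded ids and a matching attention mask."""
    max_len = max(len(s) for s in seqs)
    padded, masks = [], []
    for s in seqs:
        pad = [pad_id] * (max_len - len(s))
        if side == "right":
            padded.append(s + pad)
            masks.append([1] * len(s) + [0] * len(pad))
        else:  # left padding
            padded.append(pad + s)
            masks.append([0] * len(pad) + [1] * len(s))
    return padded, masks

batch = [[11, 12, 13], [21, 22]]
right_ids, _ = pad_batch(batch, side="right")
left_ids, _ = pad_batch(batch, side="left")
print(right_ids)  # [[11, 12, 13], [21, 22, 0]]
print(left_ids)   # [[11, 12, 13], [0, 21, 22]]
```

With right padding, the last real token of each prompt sits at a different position per sequence, so finding the end of each prompt takes extra bookkeeping; with left padding, every prompt ends at the final position, which is why left padding is the usual choice for batched autoregressive generation.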