intel / llm-on-ray

Pretrain, finetune and serve LLMs on Intel platforms with Ray
Apache License 2.0
103 stars · 30 forks

Calculate correct input length for every prompt in a single batch #222

Open kira-lin opened 6 months ago

kira-lin commented 6 months ago

After #209 is closed, consider calculating the correct input length for every prompt in MultiplePromptInput, as well as the number of generated tokens.

```python
# Count pad tokens per prompt: sum along the sequence dimension (dim=1),
# not dim=0, which would sum across the batch instead.
torch.sum(input_ids == tokenizer.pad_token_id, dim=1).tolist()
```

Doing so lets us exclude pad tokens when calculating benchmark results.
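A minimal sketch of the idea, assuming right padding and a hypothetical `pad_token_id` of 0 (the real value comes from the tokenizer):

```python
import torch

# Hypothetical batch of 2 right-padded prompts with pad_token_id = 0.
pad_token_id = 0
input_ids = torch.tensor([
    [11, 12, 13, 0, 0],    # true length 3
    [21, 22, 23, 24, 25],  # true length 5
])

# Per-prompt input length: count non-pad tokens along dim=1 (sequence dim).
input_lengths = torch.sum(input_ids != pad_token_id, dim=1).tolist()
print(input_lengths)  # [3, 5]
```

Equivalently, one can count pad tokens per prompt and subtract them from the padded sequence length; either way, benchmark token counts no longer include padding.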