THU-ESIS / Chinese-Mistral

Chinese-Mistral: An Efficient and Effective Chinese Large Language Model
Apache License 2.0

batch inference #4

Open x6p2n9q8a4 opened 1 month ago

x6p2n9q8a4 commented 1 month ago

Hi authors,

I want to test the performance of Mistral-7B on the test dataset. Is single-sample inference (with `model.generate(...)`) the only option? Are there any methods to accelerate the process?

Thanks

THUchenzhou commented 1 month ago

You can batch the prompts and then iterate over the paired inputs and outputs, e.g.: `for input_ids, output_ids in zip(batched_inputs.input_ids, batched_outputs): ...`

or refer to https://github.com/ggerganov/llama.cpp
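A minimal sketch of what the zip idiom in the reply above looks like in full, assuming the standard Hugging Face `transformers` generation API (the padding settings and the `strip_prompt`/`batch_generate` helper names are illustrative, not from this thread):

```python
def strip_prompt(input_ids, output_ids):
    # generate() returns each row as prompt tokens followed by new tokens;
    # slice off the (padded) prompt so only the completion remains.
    return output_ids[len(input_ids):]

def batch_generate(model, tokenizer, prompts, max_new_tokens=64):
    # Left-padding keeps every prompt flush against its completion,
    # which decoder-only models need for correct batched generation.
    tokenizer.padding_side = "left"
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    inputs = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # The zip idiom from the reply: pair each padded prompt row with its
    # generated row and decode only the newly generated tokens.
    return [
        tokenizer.decode(strip_prompt(i, o), skip_special_tokens=True)
        for i, o in zip(inputs.input_ids, outputs)
    ]
```

Load the model and tokenizer as usual with `AutoModelForCausalLM.from_pretrained(...)` and `AutoTokenizer.from_pretrained(...)` and pass them in; for higher throughput, llama.cpp (linked above) or a dedicated serving engine will go further than plain batching.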