open-compass / T-Eval

[ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step
https://open-compass.github.io/T-Eval/
Apache License 2.0

Cannot eval when batch_size > 1 #48

Open dkqkxx opened 5 months ago

dkqkxx commented 5 months ago

Tested 9 samples, left 523 samples, total 532 samples
3%|█████▋ | 15/523 [00:01<00:49, 10.31it/s]
Traceback (most recent call last):
  File "/home/contribute/llm-couple/evaluation/T-Eval/test.py", line 117, in <module>
    prediction = infer(dataset, llm, args.out_dir, tmp_folder_name=tmp_folder_name, test_num=test_num, batch_size=args.batch_size)
  File "/home/contribute/llm-couple/evaluation/T-Eval/test.py", line 74, in infer
    predictions = llm.chat(batch_infer_list, do_sample=False)
  File "/home/anaconda3/envs/llm-couple/lib/python3.9/site-packages/lagent/llms/base_llm.py", line 191, in chat
    return self.generate(_inputs, **genparams)
  File "/home/anaconda3/envs/llm-couple/lib/python3.9/site-packages/lagent/llms/huggingface.py", line 133, in generate
    for status, chunk, in self.stream_generate(inputs, do_sample,
  File "/home/anaconda3/envs/llm-couple/lib/python3.9/site-packages/lagent/llms/huggingface.py", line 272, in stream_generate
    if (unfinished_sequences.max() == 0
RuntimeError: Boolean value of Tensor with more than one value is ambiguous

Could you provide a fix for this?

zehuichen123 commented 3 months ago

Which model are you using for inference? Some models do not support batch inference.

dkqkxx commented 3 months ago

> Which model are you using for inference? Some models do not support batch inference.

llama2. Which models have you tested that do support it?
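
Until it is clear which models support batch inference, one possible workaround is to keep the batched call site but feed the model one sample at a time. The sketch below is only an illustration of that idea, not the project's fix: `chat_one_by_one` and `fake_chat` are hypothetical names, and it assumes the underlying chat call works when given a single input, which the error-free `batch_size=1` path suggests but which has not been verified here.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")  # type of one input sample (e.g. a message list)
R = TypeVar("R")  # type of one model response


def chat_one_by_one(chat_one: Callable[[T], R], batch: Iterable[T]) -> List[R]:
    """Apply a single-sample chat function to each item of a batch.

    Hypothetical helper: it trades throughput for compatibility with
    models whose generation loop fails on batches larger than one.
    Results are returned in the same order as the inputs.
    """
    return [chat_one(item) for item in batch]


def fake_chat(prompt: str) -> str:
    """Stand-in for a real single-sample model call (assumption)."""
    return f"echo: {prompt}"


if __name__ == "__main__":
    batch_infer_list = ["first prompt", "second prompt"]
    predictions = chat_one_by_one(fake_chat, batch_infer_list)
    print(predictions)
```

With the `llm` object from the traceback, `chat_one` might be something like `lambda item: llm.chat(item, do_sample=False)`, assuming `chat` accepts a single input in that form. The simpler equivalent is to run the evaluation with `batch_size` set to 1.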