Closed JumpingRain closed 10 months ago
I ran into the same problem.
It turns out to be a batched-inference issue. How do I solve it? `model.generate` raises the error above whenever the `num_return_sequences` argument is specified, for example:
pred = model.generate(inputs=input_ids, do_sample=True, max_new_tokens=100, min_new_tokens=10, num_return_sequences=5)
Removing that argument makes it work again, for example:
pred = model.generate(inputs=input_ids, do_sample=True, max_new_tokens=100, min_new_tokens=10)
As a result, inference can only produce a single answer per prompt instead of several. Is something wrong here?
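Until the fix lands, one possible workaround (a sketch, not the upstream fix) is to expand the batch yourself with `repeat_interleave` and call `generate` without `num_return_sequences`: each repeated row is sampled independently, so five copies of a prompt yield five answers. Only the tensor expansion below is concrete; the commented `model.generate` call is assumed to match the snippet above.

```python
import torch

def expand_for_sampling(input_ids: torch.Tensor, num_return_sequences: int) -> torch.Tensor:
    """Repeat each prompt row so that sampling the expanded batch once
    produces num_return_sequences answers per original prompt."""
    return input_ids.repeat_interleave(num_return_sequences, dim=0)

# two prompts of length 4
input_ids = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])
expanded = expand_for_sampling(input_ids, 5)
print(expanded.shape)  # torch.Size([10, 4])

# hypothetical usage, replacing the failing call:
# pred = model.generate(inputs=expanded, do_sample=True,
#                       max_new_tokens=100, min_new_tokens=10)
```

Rows 0-4 of `expanded` are copies of the first prompt and rows 5-9 of the second, so the generated sequences can be regrouped by prompt afterwards.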
Thanks for the feedback; this has been fixed.
Is there an existing issue / discussion for this?
Is there an existing answer for this in FAQ?
Current Behavior
The problem occurred while running evaluation with https://github.com/EleutherAI/lm-evaluation-harness, but it is unrelated to the evaluation framework itself.
The error reports that `causal_mask` is `NoneType`.
  0%|          | 0/1319 [00:00<?, ?it/s]
/home/admin/.conda/envs/llm_eval/lib/python3.8/site-packages/transformers/generation/configuration_utils.py:367: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.8` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
  warnings.warn(
/home/admin/.conda/envs/llm_eval/lib/python3.8/site-packages/transformers/generation/configuration_utils.py:377: UserWarning: `do_sample` is set to `False`. However, `top_k` is set to `0` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_k`.
  warnings.warn(
  0%|          | 1/1319 [00:03<1:15:50, 3.45s/it]
Traceback (most recent call last):
  File "main.py", line 95, in <module>
    main()
  File "main.py", line 60, in main
    results = evaluator.simple_evaluate(
  File "/apsara/TempRoot/Odps/ai_business_intern_dev_20231205041552862gsv1p7g88w6_017ea661_b7f0_44a2_9c05_a6a40705f0f4_AlgoTask_0_0/PyTorchWorker@#0/workspace/src/lm-evaluation-harness/lm_eval/utils.py", line 243, in _wrapper
    return fn(*args, **kwargs)
  File "/apsara/TempRoot/Odps/ai_business_intern_dev_20231205041552862gsv1p7g88w6_017ea661_b7f0_44a2_9c05_a6a40705f0f4_AlgoTask_0_0/PyTorchWorker@#0/workspace/src/lm-evaluation-harness/lm_eval/evaluator.py", line 100, in simple_evaluate
    results = evaluate(
  File "/apsara/TempRoot/Odps/ai_business_intern_dev_20231205041552862gsv1p7g88w6_017ea661_b7f0_44a2_9c05_a6a40705f0f4_AlgoTask_0_0/PyTorchWorker@#0/workspace/src/lm-evaluation-harness/lm_eval/utils.py", line 243, in _wrapper
    return fn(*args, **kwargs)
  File "/apsara/TempRoot/Odps/ai_business_intern_dev_20231205041552862gsv1p7g88w6_017ea661_b7f0_44a2_9c05_a6a40705f0f4_AlgoTask_0_0/PyTorchWorker@#0/workspace/src/lm-evaluation-harness/lm_eval/evaluator.py", line 295, in evaluate
    resps = getattr(lm, reqtype)([req.args for req in reqs])
  File "/apsara/TempRoot/Odps/ai_business_intern_dev_20231205041552862gsv1p7g88w6_017ea661_b7f0_44a2_9c05_a6a40705f0f4_AlgoTask_0_0/PyTorchWorker@#0/workspace/src/lm-evaluation-harness/lm_eval/models/huggingface.py", line 469, in greedy_until
    responses = self._model_generate(
  File "/apsara/TempRoot/Odps/ai_business_intern_dev_20231205041552862gsv1p7g88w6_017ea661_b7f0_44a2_9c05_a6a40705f0f4_AlgoTask_0_0/PyTorchWorker@#0/workspace/src/lm-evaluation-harness/lm_eval/models/huggingface.py", line 536, in _model_generate
    generations = self.model.generate(
  File "/home/admin/.cache/huggingface/modules/transformers_modules/modeling_qwen.py", line 1261, in generate
    return super().generate(
  File "/home/admin/.conda/envs/llm_eval/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/admin/.conda/envs/llm_eval/lib/python3.8/site-packages/transformers/generation/utils.py", line 1606, in generate
    return self.greedy_search(
  File "/home/admin/.conda/envs/llm_eval/lib/python3.8/site-packages/transformers/generation/utils.py", line 2454, in greedy_search
    outputs = self(
  File "/home/admin/.conda/envs/llm_eval/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/admin/.conda/envs/llm_eval/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/admin/.cache/huggingface/modules/transformers_modules/modeling_qwen.py", line 1045, in forward
    transformer_outputs = self.transformer(
  File "/home/admin/.conda/envs/llm_eval/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/admin/.cache/huggingface/modules/transformers_modules/modeling_qwen.py", line 893, in forward
    outputs = block(
  File "/home/admin/.conda/envs/llm_eval/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/admin/.conda/envs/llm_eval/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/admin/.cache/huggingface/modules/transformers_modules/modeling_qwen.py", line 612, in forward
    attn_outputs = self.attn(
  File "/home/admin/.conda/envs/llm_eval/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/admin/.conda/envs/llm_eval/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/admin/.cache/huggingface/modules/transformers_modules/modeling_qwen.py", line 524, in forward
    -1, -1, causal_mask.size(2), -1
AttributeError: 'NoneType' object has no attribute 'size'
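For reference, the final frame suggests the attention layer calls `causal_mask.size(2)` unconditionally, so any code path that reaches it with `causal_mask=None` crashes. Below is a minimal sketch of the kind of `None` guard such a fix would add; the helper name and mask shape are hypothetical and not Qwen's actual code.

```python
import torch

def expand_causal_mask(causal_mask):
    # hypothetical guard: the traceback shows modeling_qwen.py line 524
    # evaluating causal_mask.size(2) even when causal_mask is None
    if causal_mask is None:
        return None  # caller should skip masking entirely
    return causal_mask.expand(-1, -1, causal_mask.size(2), -1)

# None now passes through instead of raising AttributeError
assert expand_causal_mask(None) is None

# an ordinary (batch, head, query, key) boolean mask is unaffected
mask = torch.ones(1, 1, 3, 3, dtype=torch.bool)
print(expand_causal_mask(mask).shape)  # torch.Size([1, 1, 3, 3])
```

The point is only that the mask's absence must be handled before `.size()` is called; where the `None` mask originates in the batched `num_return_sequences` path is what the maintainers' fix addresses.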
Expected Behavior
The evaluation completes normally.
Steps To Reproduce
No response
Environment
Anything else?
No response