pangpang-xuan closed this issue 6 months ago
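Please use the vllm in the example folder. It has been modified from the official version and runs successfully.
I am already running inference with the vllm in examples. There are 8535 prompts to process in total.
After a little over 400 prompts, the same error occurred: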
RuntimeError: probability tensor contains either inf, nan or element < 0
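The message lists three conditions that make a probability vector unusable for sampling. The sketch below is only an illustration in plain Python, not vLLM's or PyTorch's actual code: `invalid_entries` and `softmax` are hypothetical helpers, and the infinite logit is an assumed example input, not a diagnosis of this issue. It shows how a single non-finite logit can turn every probability into `nan`.

```python
import math
from typing import List, Sequence


def invalid_entries(probs: Sequence[float]) -> List[int]:
    """Return indices of entries that are inf, nan, or negative."""
    return [
        i for i, p in enumerate(probs)
        if math.isinf(p) or math.isnan(p) or p < 0
    ]


def softmax(logits: Sequence[float]) -> List[float]:
    """Softmax with the usual max-subtraction for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Finite logits give a valid distribution.
healthy = softmax([0.0, 1.0, 2.0])
print(invalid_entries(healthy))

# Assumed example: one logit overflowed to +inf. inf - inf is nan,
# so the normalising sum and every probability become nan.
broken = softmax([1.0, float("inf")])
print(invalid_entries(broken))
```

A check of this kind on the logits or probabilities just before sampling may help tell whether the bad values come from the model's forward pass or from later processing.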
Processed prompts: 5%|▌ | 429/8535 [00:27<08:36, 15.70it/s]
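Since the run fails partway through (around prompt 429 of 8535), it can help to find out which inputs trigger the error. The following is a minimal sketch, assuming the failure surfaces as a `RuntimeError` from a generation call. `find_failing_prompts` and `generate_fn` are hypothetical names; `generate_fn` stands in for a wrapper such as `lambda chunk: llm.generate(chunk, sampling_params)`.

```python
from typing import Callable, List, Sequence


def find_failing_prompts(
    prompts: Sequence[str],
    generate_fn: Callable[[List[str]], object],
    chunk_size: int = 64,
) -> List[str]:
    """Return the prompts for which generate_fn raises RuntimeError."""
    failing: List[str] = []

    def bisect(chunk: List[str]) -> None:
        try:
            generate_fn(chunk)
            return  # the whole chunk succeeded
        except RuntimeError:
            if len(chunk) == 1:
                failing.append(chunk[0])
                return
        # The chunk failed and has more than one prompt: split and retry.
        mid = len(chunk) // 2
        bisect(chunk[:mid])
        bisect(chunk[mid:])

    for start in range(0, len(prompts), chunk_size):
        bisect(list(prompts[start:start + chunk_size]))
    return failing


# Usage with a stand-in generator that fails on marked prompts.
def fake_generate(chunk: List[str]) -> List[str]:
    if any("bad" in p for p in chunk):
        raise RuntimeError("probability tensor contains either inf, nan or element < 0")
    return chunk


demo = ["p0", "p1", "bad-2", "p3"]
print(find_failing_prompts(demo, fake_generate, chunk_size=4))
```

If the failure depends on batch composition or on random sampling rather than on a single prompt, both halves of a failing chunk may succeed on retry and nothing is recorded, so an empty result does not rule out a problem.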
sampling_params = SamplingParams(temperature=0.95, top_p=0.95)
# BlueLM
llm = LLM(model=args.model_dir, trust_remote_code=True)
I am deploying BlueLM with the officially provided vllm. The code I use and the error that keeps appearing are shown below.
llm = LLM(model=args.model_dir, tokenizer_mode='auto', trust_remote_code=True, max_num_seqs=max_num_seqs, max_model_len=max_model_len, max_num_batched_tokens=max_num_batched_tokens)
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/vllm/entrypoints/llm.py", line 93, in __init__
    self.llm_engine = LLMEngine.from_engine_args(engine_args)
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 231, in from_engine_args
    engine = cls(*engine_configs,
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 113, in __init__
    self._init_cache()
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 193, in _init_cache
    num_blocks = self._run_workers(
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 700, in _run_workers
    output = executor(*args, **kwargs)
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/vllm/worker/worker.py", line 111, in profile_num_available_blocks
    self.model(
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/vllm/model_executor/models/bluelm.py", line 266, in forward
    next_tokens = self.sampler(self.lm_head.weight, hidden_states,
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/vllm/model_executor/layers/sampler.py", line 719, in forward
    sample_results = _sample(probs, logprobs, input_metadata)
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/vllm/model_executor/layers/sampler.py", line 1082, in _sample
    sample_results = _random_sample(seq_groups, is_prompts,
  File "/home/bingxing2/home/scx6592/.conda/envs/python310/lib/python3.10/site-packages/vllm/model_executor/layers/sampler.py", line 977, in _random_sample
    random_samples = torch.multinomial(probs,
RuntimeError: probability tensor contains either inf, nan or elem
Environment: vllm 0.2.2+cuda118, A100, transformers==4.34.1, torch==2.1.2