Open asfadfaf opened 2 months ago
Can you report the version of vllm you are using? This works fine on the last few releases:
```
$ pip show vllm
Name: vllm
Version: 0.5.3.post1
Summary: A high-throughput and memory-efficient inference and serving engine for LLMs
Home-page: https://github.com/vllm-project/vllm
Author: vLLM Team
Author-email:
License: Apache 2.0
Location: /home/mgoin/venvs/vllm-rel/lib/python3.10/site-packages
Requires: aiohttp, cmake, fastapi, filelock, lm-format-enforcer, ninja, numpy, nvidia-ml-py, openai, outlines, pillow, prometheus-client, prometheus-fastapi-instrumentator, psutil, py-cpuinfo, pydantic, pyzmq, ray, requests, sentencepiece, tiktoken, tokenizers, torch, torchvision, tqdm, transformers, typing-extensions, uvicorn, vllm-flash-attn, xformers
Required-by:
```
```
$ python
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from vllm.utils import FlexibleArgumentParser
>>> FlexibleArgumentParser
<class 'vllm.utils.FlexibleArgumentParser'>
```
Here is my version:
```
Name: vllm
Version: 0.5.0.post1
Summary: A high-throughput and memory-efficient inference and serving engine for LLMs
Home-page: https://github.com/vllm-project/vllm
Author: vLLM Team
Author-email:
License: Apache 2.0
Location: /usr/local/lib/python3.10/site-packages
Requires: aiohttp, cmake, fastapi, filelock, lm-format-enforcer, ninja, numpy, nvidia-ml-py, openai, outlines, pillow, prometheus-client, prometheus-fastapi-instrumentator, psutil, py-cpuinfo, pydantic, ray, requests, sentencepiece, tiktoken, tokenizers, torch, transformers, typing-extensions, uvicorn, vllm-flash-attn, xformers
Required-by:
```

```
Python 3.10.14 (main, May 29 2024, 23:47:02) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
```
Please update your version of vllm; the latest is 0.5.4, but FlexibleArgumentParser was only added in 0.5.1, so your 0.5.0.post1 install does not include it.
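Since the fix hinges on a version comparison, a small sketch may help: `FlexibleArgumentParser` first shipped in 0.5.1, so any older install raises the `ImportError`. The `parse_version` helper below is a hypothetical, simplified stand-in for a real version parser such as `packaging.version.parse`; the version strings are the ones quoted in this thread.

```python
def parse_version(v):
    """Parse a release string like '0.5.0.post1' into a comparable tuple,
    stopping at non-numeric suffixes such as 'post1'.
    (A simplified stand-in for packaging.version.parse.)"""
    parts = []
    for piece in v.split("."):
        if piece.isdigit():
            parts.append(int(piece))
        else:
            break
    return tuple(parts)

installed = parse_version("0.5.0.post1")  # the reporter's version, from this thread
required = parse_version("0.5.1")         # first release that ships FlexibleArgumentParser

if installed < required:
    print("vllm too old for FlexibleArgumentParser; run: pip install --upgrade vllm")
```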
🐛 Describe the bug
Even though I have updated the package to the latest version, the import still fails:

```
File "/mnt/workspace/autodl-tmp/benchmark_throughput.py", line 15, in <module>
    from vllm.utils import FlexibleArgumentParser
ImportError: cannot import name 'FlexibleArgumentParser' from 'vllm.utils'
```
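If the upgrade really succeeded, an `ImportError` like this usually means the script is running under a different interpreter or picking up a stale install from another site-packages directory. A hedged sketch of a quick diagnostic, plus a fallback for older versions (as far as I know, `FlexibleArgumentParser` is a thin subclass of `argparse.ArgumentParser`, so benchmark scripts with standard flags still run on the base class):

```python
import sys

# Show which interpreter and which vllm install the script actually uses;
# a path outside the upgraded environment would explain the stale import.
print(sys.executable)
try:
    import vllm
    print(vllm.__version__, vllm.__file__)
except ImportError:
    print("vllm is not importable from this interpreter")

# Fallback: on vllm < 0.5.1 FlexibleArgumentParser does not exist, so fall
# back to the stdlib base class it extends (assumption: the script only
# needs standard argparse behavior).
try:
    from vllm.utils import FlexibleArgumentParser
except ImportError:
    from argparse import ArgumentParser as FlexibleArgumentParser
```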