Checklist
[X] 1. I have searched related issues but cannot get the expected help.
[ ] 2. The bug has not been fixed in the latest version.
[X] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
Describe the bug
Problem
I am deploying InternVL2-8B with pipeline on a machine with 4 V100 GPUs in total.
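For readers unfamiliar with this setup, here is a minimal sketch of what such a deployment might look like, combined with the debug settings mentioned later in this report (TM_DEBUG_LEVEL=DEBUG and log_level=INFO). It is not the reporter's actual code. The model path, the use of TurbomindEngineConfig, and tp=4 (one tensor-parallel rank per V100) are assumptions, and the helper names build_debug_setup and create_pipeline are hypothetical.

```python
import os


def build_debug_setup(model_path="OpenGVLab/InternVL2-8B", tp=4):
    """Collect the settings from this report without touching any GPU.

    model_path and tp are assumptions; the report only states that
    InternVL2-8B is deployed on 4 V100 cards.
    """
    # Environment variable named in this report; set it before the
    # engine is created so the backend can see it.
    env = {"TM_DEBUG_LEVEL": "DEBUG"}
    # log_level=INFO is the pipeline setting named in this report.
    pipeline_kwargs = {"model_path": model_path, "tp": tp, "log_level": "INFO"}
    return env, pipeline_kwargs


def create_pipeline(env, pipeline_kwargs):
    """Apply the settings and build the pipeline (needs lmdeploy and the GPUs)."""
    os.environ.update(env)
    # Imported lazily so build_debug_setup can be used without lmdeploy installed.
    from lmdeploy import TurbomindEngineConfig, pipeline

    # Assumption: the TurboMind backend with tensor parallelism across all cards.
    backend_config = TurbomindEngineConfig(tp=pipeline_kwargs["tp"])
    return pipeline(
        pipeline_kwargs["model_path"],
        backend_config=backend_config,
        log_level=pipeline_kwargs["log_level"],
    )
```

On a machine with lmdeploy installed and the GPUs visible, create_pipeline(*build_debug_setup()) would build the pipeline. Keeping the settings in a separate function makes it easy to print them alongside the logs when comparing runs.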
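Related issue
I went through the related issues and found that the closest one is https://github.com/InternLM/lmdeploy/issues/2250#issue-2452254301 , but that issue was never resolved. Following the approach in that issue, I set TM_DEBUG_LEVEL=DEBUG and initialized the pipeline with log_level=INFO. The resulting output is included in the Error traceback section. I suspected an NCCL problem, but no NCCL error was reported.
Conda environment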
absl-py 2.1.0 accelerate 0.33.0 addict 2.4.0 aiofiles 24.1.0 aiohttp 3.9.5 aiosignal 1.3.1 altair 5.3.0 annotated-types 0.7.0 anyio 4.4.0 asttokens 2.4.1 async-timeout 4.0.3 attrs 23.2.0 backcall 0.2.0 beautifulsoup4 4.12.3 bitsandbytes 0.41.0 blinker 1.8.2 cachetools 5.4.0 certifi 2024.7.4 charset-normalizer 3.3.2 click 8.1.7 colorama 0.4.6 comm 0.2.2 contourpy 1.2.1 cycler 0.12.1 datasets 2.20.0 debugpy 1.6.7 decorator 5.1.1 decord 0.6.0 deepspeed 0.13.5 dill 0.3.8 distro 1.9.0 dnspython 2.6.1 einops 0.8.0 einops-exts 0.0.4 email_validator 2.2.0 entrypoints 0.4 et-xmlfile 1.1.0 exceptiongroup 1.2.2 executing 2.0.1 fastapi 0.111.1 fastapi-cli 0.0.4 ffmpy 0.3.3 filelock 3.15.4 fire 0.6.0 fonttools 4.53.1 frozenlist 1.4.1 fsspec 2024.5.0 future 1.0.0 gdown 5.2.0 gitdb 4.0.11 GitPython 3.1.43 gradio 3.35.2 gradio_client 0.2.9 grpcio 1.65.1 h11 0.14.0 hjson 3.1.0 httpcore 0.17.3 httptools 0.6.1 httpx 0.24.0 huggingface-hub 0.24.3 idna 3.7 imageio 2.34.2 importlib_metadata 8.2.0 importlib_resources 6.4.0 ipykernel 6.29.5 ipython 8.12.0 jedi 0.19.1 Jinja2 3.1.4 joblib 1.4.2 jsonschema 4.23.0 jsonschema-specifications 2023.12.1 jupyter-client 7.3.4 jupyter_core 5.7.2 kiwisolver 1.4.5 latex2mathml 3.77.0 linkify-it-py 2.0.3 llava 1.7.0.dev0 lmdeploy 0.5.2.post1 Markdown 3.6 markdown-it-py 2.2.0 markdown2 2.5.0 MarkupSafe 2.1.5 matplotlib 3.9.1 matplotlib-inline 0.1.7 mdit-py-plugins 0.3.3 mdurl 0.1.2 mmcls 0.25.0 mmcv-full 1.6.2 mmengine-lite 0.10.4 mmsegmentation 0.30.0 model-index 0.1.11 mpmath 1.3.0 multidict 6.0.5 multiprocess 0.70.16 nest_asyncio 1.6.0 networkx 3.2.1 ninja 1.11.1.1 numpy 1.26.4 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu12 8.9.2.26 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu12 12.1.0.106 nvidia-nccl-cu12 2.19.3 nvidia-nvjitlink-cu12 12.5.82 nvidia-nvtx-cu12 12.1.105 openai 1.37.1 
opencv-python 4.10.0.84 opendatalab 0.0.10 openmim 0.3.9 openpyxl 3.1.5 openxlab 0.0.11 ordered-set 4.1.0 orjson 3.10.6 packaging 24.1 pandas 2.2.2 parso 0.8.4 peft 0.11.1 pexpect 4.9.0 pickleshare 0.7.5 Pillow 9.5.0 pip 24.0 platformdirs 4.2.2 prettytable 3.10.2 prompt_toolkit 3.0.47 protobuf 4.25.4 psutil 5.9.0 ptyprocess 0.7.0 pure_eval 0.2.3 py-cpuinfo 9.0.0 pyarrow 17.0.0 pyarrow-hotfix 0.6 pycocoevalcap 1.2 pycocotools 2.0.8 pycryptodome 3.20.0 pydantic 2.8.2 pydantic_core 2.20.1 pydeck 0.9.1 pydub 0.25.1 Pygments 2.18.0 pynvml 11.5.3 pyodps 0.11.6.2 pyparsing 3.1.2 PySocks 1.7.1 python-dateutil 2.9.0 python-dotenv 1.0.1 python-multipart 0.0.9 pytz 2024.1 PyYAML 6.0.1 pyzmq 25.1.2 referencing 0.35.1 regex 2024.7.24 requests 2.32.3 rich 13.7.1 rpds-py 0.19.1 safetensors 0.4.3 scikit-learn 1.5.1 scipy 1.13.1 semantic-version 2.10.0 sentencepiece 0.1.99 setuptools 69.5.1 shellingham 1.5.4 shortuuid 1.0.13 six 1.16.0 smmap 5.0.1 sniffio 1.3.1 soupsieve 2.5 stack-data 0.6.2 starlette 0.37.2 streamlit 1.37.0 streamlit-image-select 0.6.0 svgwrite 1.4.3 sympy 1.13.1 tabulate 0.9.0 tenacity 8.5.0 tensorboard 2.17.0 tensorboard-data-server 0.7.2 tensorboardX 2.6.2.2 termcolor 2.4.0 threadpoolctl 3.5.0 tiktoken 0.7.0 timm 0.9.12 tokenizers 0.19.1 toml 0.10.2 tomli 2.0.1 toolz 0.12.1 torch 2.2.2 torchvision 0.17.2 tornado 6.1 tqdm 4.66.4 traitlets 5.14.3 transformers 4.43.3 transformers-stream-generator 0.0.5 triton 2.2.0 typer 0.12.3 typing_extensions 4.12.2 tzdata 2024.1 uc-micro-py 1.0.3 urllib3 2.2.2 uvicorn 0.30.3 uvloop 0.19.0 watchdog 4.0.1 watchfiles 0.22.0 wavedrom 2.0.3.post3 wcwidth 0.2.13 websockets 12.0 Werkzeug 3.0.3 wheel 0.43.0 xxhash 3.4.1 yacs 0.1.8 yapf 0.40.1 yarl 1.9.4 zipp 3.19.2
Reproduction
Full code
Environment
Error traceback