QwenLM / Qwen

The official repo of Qwen (通义千问), the chat and pretrained large language model proposed by Alibaba Cloud.
Apache License 2.0

qwen-7b-chat: starting cli_demo fails with a csrc/rotary segmentation fault (core dumped) #1124

Closed: Andy1018 closed this issue 5 months ago

Andy1018 commented 7 months ago

Is there an existing issue / discussion for this?

Is there an existing answer for this in the FAQ?

Current Behavior

```
/home/user/anaconda3/envs/qwen-7b/lib/python3.11/site-packages/transformers/utils/generic.py:260: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
The model is automatically converting to bf16 for faster inference. If you want to disable the automatic precision, please manually add bf16/fp16/fp32=True to "AutoModelForCausalLM.from_pretrained".
Try importing flash-attention for faster inference...
Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
Segmentation fault (core dumped)
```
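A quick way to localize the crash is to run the suspect imports outside of cli_demo. The sketch below is a diagnostic under one assumption: that the segfault comes from a prebuilt flash-attention CUDA extension compiled against a different torch/CUDA build, not from the demo script itself.

```python
# Diagnostic sketch: run inside the same conda env that crashed.
# If the interpreter segfaults on one of these imports, that extension's wheel
# was built against a different torch/CUDA ABI and needs to be rebuilt.
import torch
print("torch", torch.__version__, "built for CUDA", torch.version.cuda)

import flash_attn                 # base flash-attn package
print("flash_attn", flash_attn.__version__)

import rotary_emb                 # extension built from csrc/rotary
import dropout_layer_norm         # extension built from csrc/layer_norm
print("rotary_emb and dropout_layer_norm imported cleanly")
```

If a single import alone reproduces the crash, rebuilding that extension against the installed torch (rather than reusing a cached wheel) is the usual remedy for this kind of ABI mismatch.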

Expected Behavior

No response

Steps To Reproduce

No response

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):

Anything else?

No response

jklj077 commented 6 months ago

Please provide the relevant environment information; the output of `pip list` or `conda list` is fine.

Andy1018 commented 6 months ago

```
(qwen-7b) [root@adsl-172-10-0-187 Qwen-main]# pip list
Package Version Editable project location
accelerate 0.27.2
addict 2.4.0
aiofiles 23.2.1
aiohttp 3.9.3
aiosignal 1.3.1
aliyun-python-sdk-core 2.15.0
aliyun-python-sdk-kms 2.16.2
altair 5.2.0
annotated-types 0.6.0
anyio 4.3.0
attrs 23.2.0
auto_gptq 0.7.1
Brotli 1.0.9
certifi 2024.2.2
cffi 1.16.0
chardet 5.2.0
charset-normalizer 3.3.2
click 8.1.7
cloudpickle 3.0.0
colorama 0.4.6
coloredlogs 15.0.1
contourpy 1.2.0
crcmod 1.7
cryptography 42.0.5
cupy-cuda12x 12.1.0
cycler 0.12.1
datasets 2.18.0
deepspeed 0.13.4
dill 0.3.8
diskcache 5.6.3
dropout-layer-norm 0.1
einops 0.7.0
fastapi 0.110.0
fastrlock 0.8.2
ffmpy 0.3.2
filelock 3.13.1
flash-attn 2.5.6
fonttools 4.49.0
frozenlist 1.4.1
fschat 0.2.36
fsspec 2024.2.0
gast 0.5.4
gekko 1.0.7
gmpy2 2.1.2
gradio 4.20.1
gradio_client 0.11.0
h11 0.14.0
hjson 3.1.0
httpcore 1.0.4
httptools 0.6.1
httpx 0.27.0
huggingface-hub 0.21.3
humanfriendly 10.0
idna 3.6
importlib-metadata 7.0.1
importlib_resources 6.1.2
interegular 0.3.3
Jinja2 3.1.3
jmespath 0.10.0
joblib 1.3.2
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
lark 1.1.9
latex2mathml 3.77.0
llvmlite 0.42.0
Markdown 3.5.2
markdown-it-py 3.0.0
markdown2 2.4.13
MarkupSafe 2.1.5
matplotlib 3.8.3
mdtex2html 1.3.0
mdurl 0.1.2
mkl-fft 1.3.8
mkl-random 1.2.4
mkl-service 2.4.0
modelscope 1.12.0
mpi4py 3.1.4
mpmath 1.3.0
msgpack 1.0.8
multidict 6.0.5
multiprocess 0.70.16
nest-asyncio 1.6.0
networkx 3.2.1
nh3 0.2.15
ninja 1.11.1.1
numba 0.59.0
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.19.3
nvidia-nvjitlink-cu12 12.3.101
nvidia-nvtx-cu12 12.1.105
openai 0.28.1
optimum 1.17.1
orjson 3.9.15
oss2 2.18.4
outlines 0.0.34
packaging 23.2
pandas 2.2.1
peft 0.9.0
pillow 10.2.0
pip 24.0
platformdirs 4.2.0
prometheus_client 0.20.0
prompt-toolkit 3.0.43
protobuf 4.25.3
psutil 5.9.8
py-cpuinfo 9.0.0
pyarrow 15.0.0
pyarrow-hotfix 0.6
pycparser 2.21
pycryptodome 3.20.0
pydantic 1.10.14
pydantic_core 2.16.3
pydub 0.25.1
Pygments 2.17.2
pynvml 11.5.0
pyparsing 3.1.1
pyproject 1.3.1
PySocks 1.7.1
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
python-multipart 0.0.9
pytz 2024.1
PyYAML 6.0.1
ray 2.9.3
referencing 0.33.0
regex 2023.12.25
requests 2.31.0
rich 13.7.1
rotary-emb 0.1
rouge 1.0.1
rpds-py 0.18.0
ruff 0.3.1
safetensors 0.4.2
scipy 1.12.0
semantic-version 2.10.0
sentencepiece 0.2.0
setuptools 68.2.2
shellingham 1.5.4
shortuuid 1.0.12
simplejson 3.19.2
six 1.16.0
sniffio 1.3.1
sortedcontainers 2.4.0
sse-starlette 2.0.0
starlette 0.36.3
svgwrite 1.4.3
sympy 1.12
tiktoken 0.6.0
tokenizers 0.15.2
tomli 2.0.1
tomlkit 0.12.0
toolz 0.12.1
torch 2.2.0
torchaudio 2.2.0
torchvision 0.17.0
tqdm 4.66.2
transformers 4.38.2
transformers-stream-generator 0.0.4
triton 2.2.0
typer 0.9.0
typing_extensions 4.10.0
tzdata 2024.1
urllib3 2.2.1
uvicorn 0.27.1
uvloop 0.19.0
vllm 0.2.2+cu124 /home/qwen/vllm-gptq
watchfiles 0.21.0
wavedrom 2.0.3.post3
wcwidth 0.2.13
websockets 11.0.3
wheel 0.41.2
xformers 0.0.24
xxhash 3.4.1
yapf 0.40.2
yarl 1.9.4
zipp 3.17.0
```
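Worth noting from this list: the torch 2.2.0 and nvidia-*-cu12 wheels target CUDA 12.1, while the locally built vllm is tagged +cu124, so extensions compiled in that toolchain may not match the running torch. As a stopgap while rebuilding flash-attn and its rotary/layer-norm kernels, here is a hedged sketch of loading the model with flash attention disabled; it assumes the `use_flash_attn` flag honored by Qwen's remote modeling code and uses a placeholder model path.

```python
# Workaround sketch: load Qwen-7B-Chat without flash attention so the
# csrc/rotary kernels are never imported. The model path is a placeholder;
# substitute the local checkpoint directory if it is already downloaded.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Qwen/Qwen-7B-Chat"  # placeholder path
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    trust_remote_code=True,
    use_flash_attn=False,  # skip the flash-attn / rotary CUDA extensions entirely
    bf16=True,             # match the automatic bf16 conversion noted in the log
).eval()

response, _ = model.chat(tokenizer, "你好", history=None)
print(response)
```

If inference works with flash attention disabled, that further points to the compiled extensions rather than the model or the demo script.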

github-actions[bot] commented 5 months ago

This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.