Open hyliush opened 3 days ago
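14784*2=29568
The official Qwen2.5-72B-Instruct-GPTQ-Int4 model should be 29696: https://huggingface.co/Qwen/Qwen2.5-72B-Instruct-GPTQ-Int4/blob/de16ae5d56f73657b43d4cab6c4925600aa6de8d/config.json#L11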
If you have quantized the model yourself, please refer to our documentation: https://qwen.readthedocs.io/zh-cn/latest/quantization/gptq.html#troubleshooting
Thanks. I was previously using the g32 version from ModelScope, and its default configuration parameter was 29568.
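For anyone checking their own checkpoint, the arithmetic in this exchange can be sketched as below. This is only an illustration: `partition_ok` is a hypothetical helper, not a vLLM function, and it assumes the failing check is simply "size per tensor-parallel rank must be a multiple of `min_thread_k`", as the error message suggests.

```python
def partition_ok(intermediate_size, tensor_parallel_size, min_thread_k=128):
    """Hypothetical helper: True if each rank's slice is a multiple of min_thread_k."""
    per_partition, remainder = divmod(intermediate_size, tensor_parallel_size)
    return remainder == 0 and per_partition % min_thread_k == 0


# 29568 is the value reported for the g32 version, 29696 the official one.
for size in (29568, 29696):
    print(size, size // 2, partition_ok(size, 2))
# 29568 14784 False
# 29696 14848 True
```

With `tensor_parallel_size=2`, 29568 gives 14784 per rank, which leaves a remainder of 64 when divided by 128, while 29696 gives 14848 = 116 * 128.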
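After padding there is only a single model file, '/path/to/padded_model/pytorch_model.bin', 130G+ in size. I copied config.json and the other files that do not end in .safetensors over to it, but loading this model for quantization raises an error. What could be the cause?
Traceback (most recent call last):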
File "/app/gptq_qwen.py", line 65, in
Model Series
Qwen2.5
What are the models used?
Qwen2.5-72B-Instruct-GPTQ-Int4
What is the scenario where the problem happened?
Qwen2.5-72B-Instruct-GPTQ-Int4 params error
Is this badcase known and can it be solved using available techniques?
Information about environment
absl-py 2.1.0 accelerate 0.31.0 adaseq 0.6.6 addict 2.4.0 aiohttp 3.9.5 aiosignal 1.3.1 albucore 0.0.12 albumentations 1.4.10 alias-free-torch 0.0.6 aliyun-python-sdk-core 2.15.1 aliyun-python-sdk-kms 2.16.3 aniso8601 9.0.1 annotated-types 0.7.0 antlr4-python3-runtime 4.9.3 anyio 4.4.0 apex 0.1 appdirs 1.4.4 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 arrow 1.3.0 asttokens 2.4.1 astunparse 1.6.3 async-lru 2.0.4 async-timeout 4.0.3 attrs 23.2.0 audioread 3.0.1 auto_gptq 0.7.1 autoawq 0.2.5 autoawq_kernels 0.0.6 av 12.2.0 Babel 2.15.0 basicsr 1.4.2 beartype 0.18.5 beautifulsoup4 4.12.3 bidict 0.23.1 binpacking 1.5.2 biopython 1.83 bitarray 2.9.2 bitsandbytes 0.43.1 bitstring 4.2.3 black 24.4.2 bleach 6.1.0 blis 0.7.11 blobfile 2.1.1 bmt-clipit 1.0 boto3 1.34.136 botocore 1.34.136 cachetools 5.3.3 catalogue 2.0.10 certifi 2024.2.2 cffi 1.16.0 cfgv 3.4.0 charset-normalizer 3.3.2 chumpy 0.70 cityscapesScripts 2.2.3 click 8.1.7 clip 1.0 cloudpathlib 0.18.1 cloudpickle 3.0.0 cmake 3.29.6 colorama 0.4.6 coloredlogs 14.0 comm 0.2.2 confection 0.1.5 ConfigArgParse 1.7 contextlib2 21.6.0 contourpy 1.2.1 control-ldm 0.0.1 crcmod 1.7 cryptography 42.0.8 cycler 0.12.1 cymem 2.0.8 Cython 0.29.36 dacite 1.8.1 dataclasses 0.6 datasets 2.18.0 ddpm-guided-diffusion 0.0.0 debugpy 1.8.2 decorator 4.4.2 decord 0.6.0 deepspeed 0.14.4 defusedxml 0.7.1 descartes 1.1.0 detectron2 0.6 dgl 2.1.0+cu121 diffusers 0.29.2 dill 0.3.8 diskcache 5.6.3 Distance 0.1.3 distlib 0.3.8 distro 1.9.0 dnspython 2.3.0 docstring_parser 0.16 easydict 1.13 easyrobust 0.2.4 edit-distance 1.0.6 editdistance 0.5.2 einops 0.8.0 email_validator 2.2.0 embeddings 0.0.8 emoji 2.12.1 espnet-tts-frontend 0.0.3 et-xmlfile 1.1.0 eventlet 0.36.1 exceptiongroup 1.2.1 executing 2.0.1 expecttest 0.2.1 face-alignment 1.4.1 fairscale 0.4.13 fairseq 0.12.2 fastai 2.7.15 fastapi 0.111.0 fastapi-cli 0.0.4 fastcore 1.5.48 fastdownload 0.0.7 fastjsonschema 2.20.0 fastprogress 1.0.3 fasttext 0.9.3 ffmpeg 1.4 ffmpeg-python 0.2.0 
filelock 3.14.0 fire 0.6.0 flake8 7.1.0 flash_attn 2.5.9.post1 Flask 2.2.5 Flask-Cors 4.0.1 Flask-RESTful 0.3.10 Flask-SocketIO 5.3.6 flask-talisman 1.1.0 flatbuffers 24.3.25 fonttools 4.53.0 fqdn 1.5.1 frozenlist 1.4.1 fsspec 2024.2.0 ftfy 6.2.0 funasr 1.0.30 funcodec 0.2.0 funtextprocessing 0.1.1 future 1.0.0 fvcore 0.1.5.post20221221 g2p 2.0.0 g2p-en 2.1.0 gast 0.5.4 gekko 1.1.3 google-pasta 0.2.0 greenlet 3.0.3 grpcio 1.64.0 h11 0.14.0 h5py 3.11.0 hdbscan 0.8.37 hjson 3.1.0 httpcore 1.0.5 httptools 0.6.1 httpx 0.27.0 huggingface-hub 0.23.4 humanfriendly 10.0 hydra-core 1.3.2 HyperPyYAML 1.2.2 identify 2.5.36 idna 3.7 imageio 2.34.2 imageio-ffmpeg 0.4.9 imgaug 0.4.0 importlib_metadata 7.1.0 inflect 7.0.0 iniconfig 2.0.0 interegular 0.3.3 iopath 0.1.9 ipdb 0.13.13 ipykernel 6.29.4 ipython 8.24.0 isoduration 20.11.0 isort 5.13.2 itsdangerous 2.2.0 jaconv 0.3.4 jamo 0.4.1 jedi 0.19.1 jieba 0.42.1 Jinja2 3.1.4 jmespath 0.10.0 joblib 1.4.2 json-tricks 3.17.3 json5 0.9.25 jsonplus 0.8.0 jsonpointer 3.0.0 jsonschema 4.22.0 jsonschema-specifications 2023.12.1 jupyter_client 8.6.2 jupyter_core 5.7.2 jupyter-events 0.10.0 jupyter-lsp 2.2.5 jupyter_server 2.14.1 jupyter_server_terminals 0.5.3 jupyterlab 4.2.3 jupyterlab_pygments 0.3.0 jupyterlab_server 2.27.2 kaldiio 2.18.0 kantts 1.0.1 keras 3.3.3 kiwisolver 1.4.5 kornia 0.7.3 kornia_rs 0.1.4 kwsbp 0.0.6 langcodes 3.4.0 language_data 1.2.0 lap 0.4.0 lark 1.1.9 lazy_loader 0.4 libclang 18.1.1 librosa 0.10.1 lightning-utilities 0.11.3.post0 llvmlite 0.43.0 lm-format-enforcer 0.10.1 lmdb 1.5.1 local-attention 1.9.3 lpips 0.1.4 lxml 4.9.4 lyft-dataset-sdk 0.0.8 marisa-trie 1.2.0 Markdown 3.6 markdown-it-py 3.0.0 MarkupSafe 2.1.5 matplotlib 3.5.3 matplotlib-inline 0.1.7 mccabe 0.7.0 mdurl 0.1.2 megatron-util 1.3.2 MinDAEC 0.0.2 mir-eval 0.7 mistune 3.0.2 ml-collections 0.1.1 ml-dtypes 0.3.2 mmcls 0.25.0 mmcv-full 1.7.0 mmdet 2.28.2 mmdet3d 1.0.0a1 mmsegmentation 0.30.0 mock 5.1.0 modelscope 1.16.0 moviepy 1.0.3 mpi4py 3.1.6 
mpmath 1.3.0 ms-swift 2.1.1.post2 msgpack 1.0.8 multidict 6.0.5 multiprocess 0.70.16 munkres 1.1.4 murmurhash 1.0.10 mypy-extensions 1.0.0 namex 0.0.8 nbclient 0.10.0 nbconvert 7.16.4 nbformat 5.10.4 nerfacc 0.2.2 nest-asyncio 1.6.0 networkx 3.3 ninja 1.11.1.1 nltk 3.8.1 nodeenv 1.9.1 notebook_shim 0.2.4 numba 0.60.0 numpy 1.26.3 nuscenes-devkit 1.1.11 nvdiffrast 0.3.1 nvidia-ml-py 12.555.43 omegaconf 2.3.0 onnx 1.16.1 onnxruntime 1.18.1 onnxsim 0.4.36 open-clip-torch 2.24.0 openai 1.35.7 opencv-python 4.10.0.84 opencv-python-headless 4.10.0.84 openpyxl 3.1.5 opt-einsum 3.3.0 optimum 1.20.0 optree 0.11.0 orjson 3.10.5 oss2 2.18.6 outlines 0.0.46 overrides 7.7.0 packaging 24.0 pai-easycv 0.11.6 paint-ldm 0.0.0 pandas 2.2.2 pandocfilters 1.5.1 panopticapi 0.1 panphon 0.20.0 parso 0.8.4 pathspec 0.12.1 peft 0.11.1 pexpect 4.9.0 phaseaug 1.0.1 pickleshare 0.7.5 pillow 10.2.0 pip 23.0.1 platformdirs 4.2.2 plotly 5.22.0 pluggy 1.5.0 plyfile 1.0.3 pointnet2 0.0.0 pooch 1.8.2 portalocker 2.8.2 pre-commit 3.7.1 preshed 3.0.9 prettytable 3.10.0 proglog 0.1.10 prometheus_client 0.20.0 prometheus-fastapi-instrumentator 7.0.0 prompt_toolkit 3.0.45 protobuf 3.20.3 psutil 5.9.8 ptflops 0.7.3 ptyprocess 0.7.0 pure-eval 0.2.2 py-cpuinfo 9.0.0 py-sound-connect 0.2.1 pyairports 2.1.1 pyarrow 16.1.0 pyarrow-hotfix 0.6 pybind11 2.13.1 pyclipper 1.3.0.post5 pycocoevalcap 1.2 pycocotools 2.0.8 pycodestyle 2.12.0 pycountry 24.6.1 pycparser 2.22 pycryptodome 3.20.0 pycryptodomex 3.20.0 pydantic 2.7.4 pydantic_core 2.18.4 pyDeprecate 0.3.2 pydot 2.0.0 pyflakes 3.2.0 Pygments 2.18.0 PyMCubes 0.1.4 pynini 2.1.5 pynndescent 0.5.13 pyparsing 3.1.2 pypinyin 0.44.0 pyquaternion 0.9.9 pysptk 0.1.18 pytest 8.2.2 pythainlp 5.0.4 python-crfsuite 0.9.10 python-dateutil 2.9.0.post0 python-dotenv 1.0.1 python-engineio 4.9.1 python-json-logger 2.0.7 python-multipart 0.0.9 python-socketio 5.11.3 pytorch-lightning 1.7.7 pytorch-metric-learning 2.5.0 pytorch-wavelets 1.3.0 pytorch-wpe 0.0.1 pytorch3d 0.7.6 
pytz 2024.1 pyvi 0.1.1 PyWavelets 1.6.0 PyYAML 6.0.1 pyzmq 26.0.3 rapidfuzz 3.9.3 ray 2.31.0 referencing 0.35.1 regex 2024.5.15 requests 2.32.3 resampy 0.4.3 rfc3339-validator 0.1.4 rfc3986-validator 0.1.1 rich 13.7.1 rotary-embedding-torch 0.6.3 rouge 1.0.1 rouge-score 0.0.4 rpds-py 0.18.1 ruamel.yaml 0.18.6 ruamel.yaml.clib 0.2.8 s3transfer 0.10.2 sacrebleu 2.4.2 sacremoses 0.1.1 safetensors 0.4.3 scikit-image 0.24.0 scikit-learn 1.5.0 scipy 1.12.0 seaborn 0.13.2 Send2Trash 1.8.3 sentencepiece 0.2.0 seqeval 1.2.2 setuptools 70.1.1 Shapely 1.8.4 shellingham 1.5.4 shotdetect-scenedetect-lgss 0.0.4 shtab 1.7.1 simple-websocket 1.0.0 simplejson 3.19.2 six 1.16.0 sklearn-crfsuite 0.5.0 smart-open 7.0.4 smplx 0.1.28 sniffio 1.3.1 sortedcontainers 2.4.0 soundfile 0.12.1 soupsieve 2.5 sox 1.5.0 soxr 0.3.7 spacy 3.7.5 spacy-legacy 3.0.12 spacy-loggers 1.0.5 speechbrain 1.0.0 srsly 2.4.8 sse-starlette 2.1.2 stack-data 0.6.3 stanza 1.8.2 starlette 0.37.2 subword-nmt 0.3.8 sympy 1.12.1 tabulate 0.9.0 taming-transformers-rom1504 0.0.6 tenacity 8.4.2 tensorboard 2.17.0 tensorboard-data-server 0.7.2 tensorboardX 2.6.2.2 tensordict 0.4.0 tensorflow 2.16.1 tensorflow-estimator 2.15.0 tensorflow-io-gcs-filesystem 0.37.0 termcolor 2.4.0 terminado 0.18.1 terminaltables 3.1.10 text-unidecode 1.3 text2sql-lgesql 1.3.0 tf-slim 1.1.0 thinc 8.2.5 thop 0.1.1.post2209072238 threadpoolctl 3.5.0 tifffile 2024.6.18 tiktoken 0.7.0 timm 1.0.7 tinycss2 1.3.0 tinycudann 1.7+torch230cu121 tokenizers 0.19.1 toml 0.10.2 tomli 2.0.1 torch 2.3.0+cu121 torch-complex 0.4.4 torch-scatter 2.1.2 torchaudio 2.3.0+cu121 torchdata 0.7.1 torchmetrics 0.11.4 torchsde 0.2.6 torchsummary 1.5.1 torchvision 0.18.0+cu121 tornado 6.4.1 tqdm 4.66.4 traitlets 5.14.3 trampoline 0.1.2 transformers 4.41.2 transformers-stream-generator 0.0.5 trimesh 2.35.39 triton 2.3.1 trl 0.9.4 ttsfrd 0.2.1 typeguard 2.13.3 typer 0.12.3 types-python-dateutil 2.9.0.20240316 typing 3.7.4.3 typing_extensions 4.12.0 tyro 0.8.5 tzdata 2024.1 
ujson 5.10.0 umap-learn 0.5.6 unicodecsv 0.14.1 unicodedata2 15.1.0 unicore 1.2.1 Unidecode 1.3.8 uri-template 1.3.0 urllib3 2.2.1 utils 1.0.2 uvicorn 0.30.1 uvloop 0.19.0 videofeatures-clipit 1.0 virtualenv 20.26.3 vllm 0.5.0.post1 vllm-flash-attn 2.5.9 wasabi 1.1.3 watchfiles 0.22.0 wcwidth 0.2.13 weasel 0.4.1 webcolors 24.6.0 webencodings 0.5.1 websocket-client 1.8.0 websockets 12.0 Werkzeug 3.0.3 wget 3.2 wheel 0.43.0 wrapt 1.16.0 wsproto 1.2.0 xformers 0.0.26.post1 xtcocotools 1.14 xxhash 3.4.1 yacs 0.1.8 yapf 0.30.0 yarl 1.9.4 zhconv 1.4.3 zipp 3.19.0 zstandard 0.22.0
Description
from vllm import LLM
llm = LLM(model=model_path, trust_remote_code=True, enforce_eager=True, max_model_len=32768, tensor_parallel_size=2)
Qwen2.5-72B-Instruct-GPTQ-Int4 fails with the error: Weight input_size_per_partition = 14784 is not divisible by min_thread_k = 128.
vllm 0.5.0.post1 + Qwen2-72B-Instruct-GPTQ-Int4 works fine.
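A quick way to see which value a local checkpoint carries before launching vLLM is to read it from config.json. The sketch below is a minimal illustration, not part of vLLM or the Qwen tooling: `per_partition_size` is a hypothetical helper, and it assumes the reported `input_size_per_partition` is `intermediate_size` divided by `tensor_parallel_size`, which matches the numbers in this issue. The demo writes a throwaway config.json rather than touching a real model directory.

```python
import json
import os
import tempfile


def per_partition_size(config_path, tensor_parallel_size):
    """Hypothetical helper: intermediate_size from config.json, split across ranks."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    return config["intermediate_size"] // tensor_parallel_size


# Demo with a temporary config.json holding the value from the g32 version.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"intermediate_size": 29568}, f)
    size = per_partition_size(path, 2)
    print(size, size % 128)  # 14784 64
```

A non-zero remainder here corresponds to the "not divisible by min_thread_k = 128" error above. To check a real checkpoint, point `config_path` at its config.json.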