AssertionError when AWQ quantizing Qwen2-72b-Instruct

Describe the bug
AWQ quantization of qwen2-72b-instruct with the following script fails with an AssertionError:

CUDA_VISIBLE_DEVICES=0 swift export \
    --model_type qwen2-72b-instruct --quant_bits 4 --quant_n_samples 128 \
    --dataset alpaca-zh alpaca-en sharegpt-gpt4:default --quant_method awq

Your hardware and system info
Ubuntu 24.04 LTS, CUDA 12.4.131, torch 2.3.1, 8x 4090

Full pip freeze output:
absl-py==2.1.0 accelerate==0.30.1 addict==2.4.0 aiofiles==23.2.1 aiohttp==3.9.5 aiosignal==1.3.1 aliyun-python-sdk-core==2.15.1 aliyun-python-sdk-kms==2.16.3 altair==5.3.0 annotated-types==0.7.0 anyio==4.4.0 argon2-cffi==23.1.0 argon2-cffi-bindings==21.2.0 arrow==1.3.0 arxiv==2.1.0 asttokens==2.4.1 async-lru==2.0.4 attributedict==0.3.0 attrs==23.2.0 auto_gptq==0.7.1 -e git+https://github.com/casper-hansen/AutoAWQ@7a9081c85724611e3448b97eadb296bb2133f991#egg=autoawq autoawq_kernels==0.0.6 Babel==2.15.0 beautifulsoup4==4.12.3 binpacking==1.5.2 bitsandbytes==0.43.1 bleach==6.1.0 blessings==1.7 blinker==1.8.2 cachetools==5.3.3 certifi==2024.6.2 cffi==1.16.0 chardet==5.2.0 charset-normalizer==3.3.2 click==8.1.7 cloudpickle==3.0.0 cmake==3.29.3 codecov==2.1.13 colorama==0.4.6 coloredlogs==15.0.1 colour-runner==0.1.1 comm==0.2.2 contourpy==1.2.1 coverage==7.5.3 cpm-kernels==1.0.11 crcmod==1.7 cryptography==42.0.8 cycler==0.12.1 dacite==1.8.1 DataProperty==1.0.1 datasets==2.18.0 debugpy==1.8.1 decorator==5.1.1 decord==0.6.0 deepdiff==7.0.1 defusedxml==0.7.1 diffusers==0.25.0 dill==0.3.8 diskcache==5.6.3 distlib==0.3.8 distro==1.9.0 dnspython==2.6.1 docker-pycreds==0.4.0 docstring_parser==0.16 editdistance==0.8.1 einops==0.8.0 email_validator==2.1.1 et-xmlfile==1.1.0 executing==2.0.1 fastapi==0.111.0 fastapi-cli==0.0.4 fastjsonschema==2.19.1 feedparser==6.0.10 ffmpy==0.3.2 filelock==3.14.0 flash-attn==2.5.9.post1 fonttools==4.53.0 fqdn==1.5.1 frozenlist==1.4.1 fsspec==2024.2.0 func-timeout==4.3.5 future==1.0.0 gast==0.5.4 gekko==1.1.1 gitdb==4.0.11 GitPython==3.1.43 gradio==4.36.1 gradio_client==1.0.1 griffe==0.45.2 grpcio==1.64.1 h11==0.14.0 httpcore==1.0.5 httptools==0.6.1 httpx==0.27.0 huggingface-hub==0.23.3 humanfriendly==10.0 idna==3.7
importlib_metadata==7.1.0 importlib_resources==6.4.0 inspecta==0.1.3 interegular==0.3.3 ipykernel==6.29.4 ipython==8.25.0 ipywidgets==8.1.3 isoduration==20.11.0 jedi==0.19.1 jieba==0.42.1 Jinja2==3.1.4 jmespath==0.10.0 joblib==1.4.2 json5==0.9.25 jsonlines==4.0.0 jsonpointer==2.4 jsonschema==4.22.0 jsonschema-specifications==2023.12.1 jupyter==1.0.0 jupyter-console==6.6.3 jupyter-events==0.10.0 jupyter-lsp==2.2.5 jupyter_client==8.6.2 jupyter_core==5.7.2 jupyter_server==2.14.1 jupyter_server_terminals==0.5.3 jupyterlab==4.2.1 jupyterlab_pygments==0.3.0 jupyterlab_server==2.27.2 jupyterlab_widgets==3.0.11 kiwisolver==1.4.5 lagent==0.2.2 lark==1.1.9 linkify-it-py==2.0.3 llmuses==0.3.0 llvmlite==0.42.0 lm-eval==0.3.0 lm-format-enforcer==0.10.1 lxml==5.2.2 Markdown==3.6 markdown-it-py==2.2.0 MarkupSafe==2.1.5 matplotlib==3.9.0 matplotlib-inline==0.1.7 mbstrdecoder==1.1.3 mdit-py-plugins==0.3.3 mdurl==0.1.2 mistune==3.0.2 mmengine==0.10.4 modelscope==1.15.0 mpmath==1.3.0 ms-swift==2.1.0 msgpack==1.0.8 multidict==6.0.5 multiprocess==0.70.16 nbclient==0.10.0 nbconvert==7.16.4 nbformat==5.10.4 nest-asyncio==1.6.0 networkx==3.3 ninja==1.11.1.1 nltk==3.8.1 notebook==7.2.0 notebook_shim==0.2.4 numba==0.59.1 numexpr==2.10.0 numpy==1.26.4 nvidia-cublas-cu12==12.1.3.1 nvidia-cuda-cupti-cu12==12.1.105 nvidia-cuda-nvrtc-cu12==12.1.105 nvidia-cuda-runtime-cu12==12.1.105 nvidia-cudnn-cu12==8.9.2.26 nvidia-cufft-cu12==11.0.2.54 nvidia-curand-cu12==10.3.2.106 nvidia-cusolver-cu12==11.4.5.107 nvidia-cusparse-cu12==12.1.0.106 nvidia-ml-py==12.555.43 nvidia-nccl-cu12==2.20.5 nvidia-nvjitlink-cu12==12.5.40 nvidia-nvtx-cu12==12.1.105 openai==0.28.0 opencv-python==4.10.0.82 openpyxl==3.1.3 optimum==1.20.0 ordered-set==4.1.0 orjson==3.10.3 oss2==2.18.5 outlines==0.0.34 overrides==7.7.0 packaging==24.0 pandas==2.2.2 pandocfilters==1.5.1 parso==0.8.4 pathvalidate==3.2.0 peft==0.11.1 pexpect==4.9.0 phx-class-registry==4.1.0 pillow==10.3.0 platformdirs==4.2.2 plotly==5.22.0 pluggy==1.5.0 
ply==3.11 portalocker==2.8.2 prometheus-fastapi-instrumentator==7.0.0 prometheus_client==0.20.0 prompt_toolkit==3.0.46 protobuf==4.25.3 psutil==5.9.8 ptyprocess==0.7.0 pure-eval==0.2.2 py-cpuinfo==9.0.0 pyarrow==16.1.0 pyarrow-hotfix==0.6 pybind11==2.12.0 pycountry==24.6.1 pycparser==2.22 pycryptodome==3.20.0 pydantic==2.7.3 pydantic_core==2.18.4 pydeck==0.9.1 pydub==0.25.1 Pygments==2.18.0 Pympler==1.0.1 pyparsing==3.1.2 pyproject-api==1.6.1 pytablewriter==1.2.0 python-dateutil==2.9.0.post0 python-dotenv==1.0.1 python-json-logger==2.0.7 python-multipart==0.0.9 pytz==2024.1 PyYAML==6.0.1 pyzmq==26.0.3 qtconsole==5.5.2 QtPy==2.4.1 ray==2.24.0 referencing==0.35.1 regex==2024.5.15 requests==2.31.0 requests-toolbelt==1.0.0 rfc3339-validator==0.1.4 rfc3986-validator==0.1.1 rich==13.7.1 rootpath==0.1.1 rouge==1.0.1 rouge-chinese==1.0.3 rouge-score==0.1.2 rpds-py==0.18.1 ruff==0.4.8 sacrebleu==1.5.0 safetensors==0.4.3 scikit-learn==1.5.0 scipy==1.13.1 seaborn==0.13.2 semantic-version==2.10.0 Send2Trash==1.8.3 sentencepiece==0.2.0 sentry-sdk==2.5.0 setproctitle==1.3.3 sgmllib3k==1.0.0 shellingham==1.5.4 shtab==1.7.1 simple-ddl-parser==1.5.1 simplejson==3.19.2 six==1.16.0 smmap==5.0.1 sniffio==1.3.1 sortedcontainers==2.4.0 soupsieve==2.5 sqlitedict==2.1.0 stack-data==0.6.3 starlette==0.37.2 streamlit==1.35.0 sympy==1.12.1 tabledata==1.3.3 tabulate==0.9.0 tcolorpy==0.1.6 tenacity==8.3.0 tensorboard==2.16.2 tensorboard-data-server==0.7.2 termcolor==2.4.0 terminado==0.18.1 texttable==1.7.0 threadpoolctl==3.5.0 tiktoken==0.7.0 tinycss2==1.3.0 tokenizers==0.19.1 toml==0.10.2 tomli==2.0.1 tomlkit==0.12.0 toolz==0.12.1 torch==2.3.1 torchvision==0.18.1 tornado==6.4.1 tox==4.15.1 tqdm==4.66.4 tqdm-multiprocess==0.0.11 traitlets==5.14.3 transformers==4.41.2 transformers-stream-generator==0.0.5 triton==2.3.1 trl==0.9.4 typepy==1.3.2 typer==0.12.3 types-python-dateutil==2.9.0.20240316 typing_extensions==4.12.1 tyro==0.8.4 tzdata==2024.1 uc-micro-py==1.0.3 ujson==5.10.0 
uri-template==1.3.0 urllib3==2.2.1 uvicorn==0.30.1 uvloop==0.19.0 virtualenv==20.26.2 vllm==0.4.3 vllm-flash-attn==2.5.8.post2 wandb==0.17.1 watchdog==4.0.1 watchfiles==0.22.0 wcwidth==0.2.13 webcolors==24.6.0 webencodings==0.5.1 websocket-client==1.8.0 websockets==11.0.3 Werkzeug==3.0.3 widgetsnbextension==4.0.11 xformers==0.0.26.post1 xtuner==0.1.11 xxhash==3.4.1 yapf==0.40.2 yarl==1.9.4 zipp==3.19.2 zstandard==0.22.0
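
Most of the list above is unrelated to quantization. A minimal sketch for printing only the versions likely to matter for this export, assuming the relevant packages are torch, transformers, accelerate, autoawq, autoawq_kernels, and ms-swift (this selection and the helper name `collect_versions` are my own, not something the swift documentation prescribes):

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed shortlist of packages involved in `swift export --quant_method awq`.
RELEVANT = (
    "torch",
    "transformers",
    "accelerate",
    "autoawq",
    "autoawq_kernels",
    "ms-swift",
)


def collect_versions(names=RELEVANT):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in collect_versions().items():
        print(f"{name}=={ver}" if ver else f"{name}: not installed")
```

Run in the same environment as the failing command, this gives a short version summary that is easier to compare against a working setup than the full freeze.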


modelscope/swift, issue #1111