casper-hansen / AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
https://casper-hansen.github.io/AutoAWQ/
MIT License

Doesn't work with Zephyr #114

Closed p-christ closed 11 months ago

p-christ commented 11 months ago

When using AutoAWQ with Zephyr, I get the error below. Does anyone know how to fix it?

/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py in set_module_tensor_to_device(module, tensor_name, device, value, dtype, fp16_statistics)
    274 
    275     if tensor_name not in module._parameters and tensor_name not in module._buffers:
--> 276         raise ValueError(f"{module} does not have a parameter or a buffer named {tensor_name}.")
    277     is_buffer = tensor_name in module._buffers
    278     old_value = getattr(module, tensor_name)

ValueError: WQLinear_GEMM(in_features=4096, out_features=4096, bias=False, w_bit=4, group_size=128) does not have a parameter or a buffer named weight.

This is my code:

!pip install autoawq -q

from awq import AutoAWQForCausalLM

model_name_or_path = "HuggingFaceH4/zephyr-7b-alpha"
model = AutoAWQForCausalLM.from_quantized(model_name_or_path, fuse_layers=True,
                                          trust_remote_code=False)

casper-hansen commented 11 months ago

I have now seen this kind of issue reported twice, which makes me think it is related to a recent update in accelerate or another Hugging Face library. I will need to investigate further to determine the cause. Can you post the output of pip list?
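
For a report like this, a short list of the packages involved is usually easier to read than a full pip list. The following is a minimal sketch using only the Python standard library. The shortlist of package names is an assumption about which libraries are relevant to this error, not something specified in the thread.

```python
from importlib import metadata

# Assumed shortlist of packages likely to matter for a model-loading error.
RELEVANT_PACKAGES = ["autoawq", "accelerate", "transformers", "torch", "safetensors"]


def relevant_versions(packages=RELEVANT_PACKAGES):
    """Return a dict mapping each package name to its installed version, or None."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


def format_versions(versions):
    """Render the versions in the same name==version style that pip uses."""
    lines = []
    for name, version in versions.items():
        lines.append(f"{name}=={version}" if version else f"{name} (not installed)")
    return "\n".join(lines)
```

Calling print(format_versions(relevant_versions())) in the same notebook or virtual environment that raised the error gives a few lines that can be pasted directly into the issue.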

p-christ commented 11 months ago

Thanks for your reply. My packages are listed below; any help is much appreciated!

absl-py==1.4.0
accelerate==0.23.0
aiohttp==3.8.6
aiohttp-cors==0.7.0
aiosignal==1.3.1
alabaster==0.7.13
albumentations==1.3.1
altair==4.2.2
anyio==3.7.1
appdirs==1.4.4
argcomplete==3.1.2
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
array-record==0.4.1
arviz==0.15.1
astropy==5.3.4
astunparse==1.6.3
async-timeout==4.0.3
atpublic==4.0
attributedict==0.3.0
attrs==23.1.0
audioread==3.0.1
autoawq==0.1.4
autograd==1.6.2
Babel==2.13.0
backcall==0.2.0
beautifulsoup4==4.11.2
bidict==0.22.1
bigframes @ file:///bigframes-latest-py2.py3-none-any.whl#sha256=e0ad1d2fb06dd9b0f1037e726e3fdf196417ea7da67d30c2c4d3abf43df5db7f
bleach==6.1.0
blessed==1.20.0
blessings==1.7
blinker==1.4
blis==0.7.11
blosc2==2.0.0
bokeh==3.2.2
bqplot==0.12.40
branca==0.6.0
build==1.0.3
CacheControl==0.13.1
cachetools==5.3.1
catalogue==2.0.10
certifi==2023.7.22
cffi==1.16.0
cfgv==3.4.0
chardet==5.2.0
charset-normalizer==3.3.0
chex==0.1.7
click==8.1.7
click-plugins==1.1.1
cligj==0.7.2
cloudpickle==2.2.1
cmake==3.27.6
cmdstanpy==1.2.0
codecov==2.1.13
colorama==0.4.6
colorcet==3.0.1
coloredlogs==15.0.1
colorful==0.5.5
colorlog==6.7.0
colorlover==0.3.0
colour==0.1.5
colour-runner==0.1.1
community==1.0.0b1
confection==0.1.3
cons==0.4.6
contextlib2==21.6.0
contourpy==1.1.1
coverage==7.3.2
cryptography==41.0.4
cufflinks==0.17.3
cupy-cuda11x==11.0.0
cvxopt==1.3.2
cvxpy==1.3.2
cycler==0.12.1
cymem==2.0.8
Cython==3.0.3
dask==2023.8.1
DataProperty==1.0.1
datascience==0.17.6
datasets==2.14.5
db-dtypes==1.1.1
dbus-python==1.2.18
debugpy==1.6.6
decorator==4.4.2
deepdiff==6.6.1
defusedxml==0.7.1
dill==0.3.7
distlib==0.3.7
distributed==2023.8.1
distro==1.7.0
dlib==19.24.2
dm-tree==0.1.8
docutils==0.18.1
dopamine-rl==4.0.6
duckdb==0.8.1
earthengine-api==0.1.374
easydict==1.10
ecos==2.0.12
editdistance==0.6.2
eerepr==0.0.4
en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.6.0/en_core_web_sm-3.6.0-py3-none-any.whl#sha256=83276fc78a70045627144786b52e1f2728ad5e29e5e43916ec37ea9c26a11212
entrypoints==0.4
et-xmlfile==1.1.0
etils==1.5.0
etuples==0.3.9
exceptiongroup==1.1.3
fastai==2.7.12
fastcore==1.5.29
fastdownload==0.0.7
fastjsonschema==2.18.1
fastprogress==1.0.3
fastrlock==0.8.2
filelock==3.12.4
Fiona==1.9.4.post1
firebase-admin==5.3.0
Flask==2.2.5
flatbuffers==23.5.26
flax==0.7.4
folium==0.14.0
fonttools==4.43.1
frozendict==2.3.8
frozenlist==1.4.0
fsspec==2023.6.0
future==0.18.3
gast==0.4.0
gcsfs==2023.6.0
GDAL==3.4.3
gdown==4.6.6
geemap==0.28.2
gensim==4.3.2
geocoder==1.38.1
geographiclib==2.0
geopandas==0.13.2
geopy==2.3.0
gin-config==0.5.0
glob2==0.7
google==2.0.3
google-api-core==2.11.1
google-api-python-client==2.84.0
google-auth==2.17.3
google-auth-httplib2==0.1.1
google-auth-oauthlib==1.0.0
google-cloud-aiplatform==1.35.0
google-cloud-bigquery==3.10.0
google-cloud-bigquery-connection==1.12.1
google-cloud-bigquery-storage==2.22.0
google-cloud-core==2.3.3
google-cloud-datastore==2.15.2
google-cloud-firestore==2.11.1
google-cloud-functions==1.13.3
google-cloud-iam==2.12.2
google-cloud-language==2.9.1
google-cloud-resource-manager==1.10.4
google-cloud-storage==2.8.0
google-cloud-testutils==1.3.3
google-cloud-translate==3.11.3
google-colab @ file:///colabtools/dist/google-colab-1.0.0.tar.gz#sha256=1afa89808ae9af63a4b5104b6ece646351ad97cc78340573b461899938a5cee1
google-crc32c==1.5.0
google-pasta==0.2.0
google-resumable-media==2.6.0
googleapis-common-protos==1.60.0
googledrivedownloader==0.4
gpustat==1.1.1
graphviz==0.20.1
greenlet==3.0.0
grpc-google-iam-v1==0.12.6
grpcio==1.51.3
grpcio-status==1.48.2
gspread==3.4.2
gspread-dataframe==3.3.1
gym==0.25.2
gym-notices==0.0.8
h5netcdf==1.2.0
h5py==3.9.0
holidays==0.34
holoviews==1.17.1
html5lib==1.1
httpimport==1.3.1
httplib2==0.22.0
huggingface-hub==0.17.3
humanfriendly==10.0
humanize==4.7.0
hyperopt==0.2.7
ibis-framework==6.2.0
identify==2.5.30
idna==3.4
imageio==2.31.5
imageio-ffmpeg==0.4.9
imagesize==1.4.1
imbalanced-learn==0.10.1
imgaug==0.4.0
importlib-metadata==6.8.0
importlib-resources==6.1.0
imutils==0.5.4
inflect==7.0.0
iniconfig==2.0.0
inspecta==0.1.3
intel-openmp==2023.2.0
ipyevents==2.0.2
ipyfilechooser==0.6.0
ipykernel==5.5.6
ipyleaflet==0.17.4
ipython==7.34.0
ipython-genutils==0.2.0
ipython-sql==0.5.0
ipytree==0.2.2
ipywidgets==7.7.1
itsdangerous==2.1.2
jax==0.4.16
jaxlib @ https://storage.googleapis.com/jax-releases/cuda11/jaxlib-0.4.16+cuda11.cudnn86-cp310-cp310-manylinux2014_x86_64.whl#sha256=78b3a9acfda4bfaae8a1dc112995d56454020f5c02dba4d24c40c906332efd4a
jedi==0.19.1
jeepney==0.7.1
jieba==0.42.1
Jinja2==3.1.2
joblib==1.3.2
jsonlines==4.0.0
jsonpickle==3.0.2
jsonschema==4.19.1
jsonschema-specifications==2023.7.1
jupyter-client==6.1.12
jupyter-console==6.1.0
jupyter-server==1.24.0
jupyter_core==5.4.0
jupyterlab-pygments==0.2.2
jupyterlab-widgets==3.0.9
kaggle==1.5.16
keras==2.13.1
keyring==23.5.0
kiwisolver==1.4.5
langcodes==3.3.0
launchpadlib==1.10.16
lazr.restfulclient==0.14.4
lazr.uri==1.0.6
lazy_loader==0.3
libclang==16.0.6
librosa==0.10.1
lightgbm==4.0.0
linkify-it-py==2.0.2
lit==17.0.2
llvmlite==0.39.1
lm-eval==0.3.0
locket==1.0.0
logical-unification==0.4.6
lxml==4.9.3
malloy==2023.1056
Markdown==3.5
markdown-it-py==3.0.0
MarkupSafe==2.1.3
matplotlib==3.7.1
matplotlib-inline==0.1.6
matplotlib-venn==0.11.9
mbstrdecoder==1.1.3
mdit-py-plugins==0.4.0
mdurl==0.1.2
miniKanren==1.0.3
missingno==0.5.2
mistune==0.8.4
mizani==0.9.3
mkl==2023.2.0
ml-dtypes==0.3.1
mlxtend==0.22.0
more-itertools==10.1.0
moviepy==1.0.3
mpmath==1.3.0
msgpack==1.0.7
multidict==6.0.4
multipledispatch==1.0.0
multiprocess==0.70.15
multitasking==0.0.11
murmurhash==1.0.10
music21==9.1.0
natsort==8.4.0
nbclassic==1.0.0
nbclient==0.8.0
nbconvert==6.5.4
nbformat==5.9.2
nest-asyncio==1.5.8
networkx==3.1
nibabel==4.0.2
nltk==3.8.1
nodeenv==1.8.0
notebook==6.5.5
notebook_shim==0.2.3
nox==2023.4.22
numba==0.56.4
numexpr==2.8.7
numpy==1.23.5
nvidia-ml-py==12.535.108
oauth2client==4.1.3
oauthlib==3.2.2
openai==0.28.1
opencensus==0.11.3
opencensus-context==0.1.3
opencv-contrib-python==4.8.0.76
opencv-python==4.8.0.76
opencv-python-headless==4.8.1.78
openpyxl==3.1.2
opt-einsum==3.3.0
optax==0.1.7
orbax-checkpoint==0.4.1
ordered-set==4.1.0
osqp==0.6.2.post8
packaging==23.2
pandas==1.5.3
pandas-datareader==0.10.0
pandas-gbq==0.19.2
pandas-stubs==1.5.3.230304
pandocfilters==1.5.0
panel==1.2.3
param==1.13.0
parso==0.8.3
parsy==2.1
partd==1.4.1
pathlib==1.0.1
pathvalidate==3.2.0
pathy==0.10.2
patsy==0.5.3
peewee==3.16.3
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.4.0
pip-tools==6.13.0
platformdirs==3.11.0
plotly==5.15.0
plotnine==0.12.3
pluggy==1.3.0
polars==0.17.3
pooch==1.7.0
portalocker==2.8.2
portpicker==1.5.2
pre-commit==3.4.0
prefetch-generator==1.0.3
preshed==3.0.9
prettytable==3.9.0
proglog==0.1.10
progressbar2==4.2.0
prometheus-client==0.17.1
promise==2.3
prompt-toolkit==3.0.39
prophet==1.1.5
proto-plus==1.22.3
protobuf==3.20.3
psutil==5.9.5
psycopg2==2.9.9
ptyprocess==0.7.0
py-cpuinfo==9.0.0
py-spy==0.3.14
py4j==0.10.9.7
pyarrow==9.0.0
pyasn1==0.5.0
pyasn1-modules==0.3.0
pybind11==2.11.1
pycocotools==2.0.7
pycountry==22.3.5
pycparser==2.21
pyct==0.5.0
pydantic==1.10.13
pydata-google-auth==1.8.2
pydot==1.4.2
pydot-ng==2.0.0
pydotplus==2.0.2
PyDrive==1.3.1
PyDrive2==1.6.3
pyerfa==2.0.0.3
pygame==2.5.2
Pygments==2.16.1
PyGObject==3.42.1
PyJWT==2.3.0
pymc==5.7.2
pymystem3==0.2.0
PyOpenGL==3.1.7
pyOpenSSL==23.2.0
pyparsing==3.1.1
pyperclip==1.8.2
pyproj==3.6.1
pyproject-api==1.6.1
pyproject_hooks==1.0.0
pyshp==2.3.1
PySocks==1.7.1
pytablewriter==1.2.0
pytensor==2.14.2
pytest==7.4.2
pytest-mock==3.11.1
python-apt==0.0.0
python-box==7.1.1
python-dateutil==2.8.2
python-louvain==0.16
python-slugify==8.0.1
python-utils==3.8.1
pytz==2023.3.post1
pyviz_comms==3.0.0
PyWavelets==1.4.1
PyYAML==6.0.1
pyzmq==23.2.1
qdldl==0.1.7.post0
qudida==0.0.4
ratelim==0.1.6
ray==2.4.0
referencing==0.30.2
regex==2023.6.3
requests==2.31.0
requests-oauthlib==1.3.1
requirements-parser==0.5.0
rich==13.6.0
rootpath==0.1.1
rouge-score==0.1.2
rpds-py==0.10.4
rpy2==3.4.2
rsa==4.9
sacrebleu==1.5.0
safetensors==0.4.0
scikit-image==0.19.3
scikit-learn==1.2.2
scipy==1.11.3
scooby==0.7.4
scs==3.2.3
seaborn==0.12.2
SecretStorage==3.3.1
Send2Trash==1.8.2
sentencepiece==0.1.99
shapely==2.0.1
six==1.16.0
sklearn-pandas==2.2.0
smart-open==6.4.0
sniffio==1.3.0
snowballstemmer==2.2.0
sortedcontainers==2.4.0
soundfile==0.12.1
soupsieve==2.5
soxr==0.3.7
spacy==3.6.1
spacy-legacy==3.0.12
spacy-loggers==1.0.5
Sphinx==5.0.2
sphinxcontrib-applehelp==1.0.7
sphinxcontrib-devhelp==1.0.5
sphinxcontrib-htmlhelp==2.0.4
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.6
sphinxcontrib-serializinghtml==1.1.9
SQLAlchemy==2.0.21
sqlglot==17.16.2
sqlitedict==2.1.0
sqlparse==0.4.4
srsly==2.4.8
stanio==0.3.0
statsmodels==0.14.0
sympy==1.12
tabledata==1.3.3
tables==3.8.0
tabulate==0.9.0
tbb==2021.10.0
tblib==2.0.0
tcolorpy==0.1.4
tenacity==8.2.3
tensorboard==2.13.0
tensorboard-data-server==0.7.1
tensorflow==2.13.0
tensorflow-datasets==4.9.3
tensorflow-estimator==2.13.0
tensorflow-gcs-config==2.13.0
tensorflow-hub==0.15.0
tensorflow-io-gcs-filesystem==0.34.0
tensorflow-metadata==1.14.0
tensorflow-probability==0.20.1
tensorstore==0.1.45
termcolor==2.3.0
terminado==0.17.1
text-unidecode==1.3
textblob==0.17.1
texttable==1.7.0
tf-slim==1.1.0
thinc==8.1.12
threadpoolctl==3.2.0
tifffile==2023.9.26
tinycss2==1.2.1
tokenizers==0.14.1
toml==0.10.2
tomli==2.0.1
toolz==0.12.0
torch @ https://download.pytorch.org/whl/cu118/torch-2.0.1%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=a7a49d459bf4862f64f7bc1a68beccf8881c2fa9f3e0569608e16ba6f85ebf7b
torchaudio @ https://download.pytorch.org/whl/cu118/torchaudio-2.0.2%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=26692645ea061a005c57ec581a2d0425210ac6ba9f923edf11cc9b0ef3a111e9
torchdata==0.6.1
torchsummary==1.5.1
torchtext==0.15.2
torchvision @ https://download.pytorch.org/whl/cu118/torchvision-0.15.2%2Bcu118-cp310-cp310-linux_x86_64.whl#sha256=19ca4ab5d6179bbe53cff79df1a855ee6533c2861ddc7389f68349d8b9f8302a
tornado==6.3.2
tox==4.11.3
tqdm==4.66.1
tqdm-multiprocess==0.0.11
traitlets==5.7.1
traittypes==0.2.1
transformers==4.34.1
triton==2.0.0
tweepy==4.13.0
typepy==1.3.2
typer==0.9.0
types-pytz==2023.3.1.1
types-setuptools==68.2.0.0
typing_extensions==4.5.0
tzlocal==5.1
uc-micro-py==1.0.2
uritemplate==4.1.1
urllib3==2.0.6
vega-datasets==0.9.0
virtualenv==20.24.5
wadllib==1.3.6
wasabi==1.1.2
wcwidth==0.2.8
webcolors==1.13
webencodings==0.5.1
websocket-client==1.6.4
Werkzeug==3.0.0
widgetsnbextension==3.6.6
wordcloud==1.9.2
wrapt==1.15.0
xarray==2023.7.0
xarray-einstats==0.6.0
xgboost==2.0.0
xlrd==2.0.1
xxhash==3.4.1
xyzservices==2023.10.0
yarl==1.9.2
yellowbrick==1.5
yfinance==0.2.31
zict==3.0.0
zipp==3.17.0
zstandard==0.21.0

p-christ commented 11 months ago

Any ideas?

casper-hansen commented 11 months ago

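Upon further inspection, it looks like you are using the API incorrectly. from_quantized only loads checkpoints that have already been quantized with AWQ, so you must not use it to load an unquantized model such as HuggingFaceH4/zephyr-7b-alpha. To quantize a model first, please follow one of the examples.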
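
The reported ValueError is consistent with this explanation: the quantized WQLinear_GEMM layers hold packed 4-bit tensors rather than a plain weight, so an unquantized checkpoint has nowhere to load its weight tensors. A minimal sketch of the intended two-step flow follows, modeled on the quantization example in the AutoAWQ README. The output directory name in the usage note is hypothetical, and the exact keyword arguments may differ between AutoAWQ versions.

```python
# Quantization settings as in the AutoAWQ README example. w_bit=4 and
# q_group_size=128 match the w_bit=4, group_size=128 shown in the error message.
QUANT_CONFIG = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}


def quantize_and_save(model_path, quant_path, quant_config=QUANT_CONFIG):
    """Step 1: quantize an unquantized checkpoint with AWQ and save the result."""
    # Imported lazily so this module can be imported without the GPU stack installed.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    # from_pretrained (not from_quantized) is the entry point for unquantized weights.
    model = AutoAWQForCausalLM.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path)

    # Runs AWQ calibration and replaces the linear layers with quantized ones.
    model.quantize(tokenizer, quant_config=quant_config)

    # Save the quantized weights and the tokenizer side by side.
    model.save_quantized(quant_path)
    tokenizer.save_pretrained(quant_path)


def load_quantized(quant_path):
    """Step 2: load the already-quantized checkpoint for inference."""
    from awq import AutoAWQForCausalLM

    return AutoAWQForCausalLM.from_quantized(quant_path, fuse_layers=True)
```

For the model in this issue, that would mean calling quantize_and_save("HuggingFaceH4/zephyr-7b-alpha", "zephyr-7b-alpha-awq") once on a machine with a suitable GPU, and then load_quantized("zephyr-7b-alpha-awq") whenever the model is needed. Pointing from_quantized at a repository that already contains AWQ weights should avoid the quantization step altogether.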