Closed by the-crypt-keeper 1 year ago
It's working through vLLM, but it turns out this is a new kind of AWQ:
https://github.com/casper-hansen/AutoAWQ
These models do not work even with the latest version of the old AWQ repo:
RuntimeError: shape '[1, 29, 1024]' is invalid for input of size 237568
So it looks like AutoAWQ will have to be implemented to replace the legacy AWQ.
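For reference, the numbers in the RuntimeError above can be checked with plain arithmetic. This is only a sketch; the helper name `explain_view_mismatch` is made up for illustration and is not part of either AWQ repo. It shows that the tensor holds exactly 8 times as many elements as the requested shape, i.e. 8192 values per token instead of 1024 if the 1 and 29 dimensions are correct. It does not identify the root cause.

```python
from math import prod


def explain_view_mismatch(target_shape, numel):
    """Compare the element count a reshape expects with what the tensor holds.

    Hypothetical helper for illustration only.
    Returns (expected_elements, whole_ratio, remainder).
    """
    expected = prod(target_shape)
    ratio, remainder = divmod(numel, expected)
    return expected, ratio, remainder


# Values copied from the error message in this thread.
expected, ratio, remainder = explain_view_mismatch((1, 29, 1024), 237568)
print(expected, ratio, remainder)  # 29696 8 0
```

A clean ratio with zero remainder suggests a consistent layout difference between the checkpoint and the loader rather than a corrupted tensor, though that reading is an assumption.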
Completed via vllm-awq.
The latest vLLM adds AWQ support. It would be interesting to compare its performance against the native AWQ executor we already support:
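A comparison like this could be set up with a small backend-agnostic timing harness. The sketch below is an assumption about how one might do it, not how this repo measures anything: `measure_throughput` and `fake_generate` are hypothetical names, and each backend is assumed to be wrapped in a function that takes a list of prompts and returns one list of generated token ids per prompt.

```python
import time


def measure_throughput(generate_fn, prompts):
    """Time one batch of generation and report generated tokens per second.

    generate_fn is assumed (hypothetically) to return a list of token-id
    lists, one per prompt.
    """
    start = time.perf_counter()
    outputs = generate_fn(prompts)
    elapsed = time.perf_counter() - start
    total_tokens = sum(len(tokens) for tokens in outputs)
    # Guard against a zero reading on very coarse timers.
    tokens_per_second = total_tokens / elapsed if elapsed > 0 else float("inf")
    return total_tokens, elapsed, tokens_per_second


def fake_generate(prompts):
    """Stand-in backend so the harness runs without a GPU or a model."""
    return [[0] * 16 for _ in prompts]


tokens, seconds, tps = measure_throughput(fake_generate, ["a", "b", "c"])
print(tokens)  # 48
```

For the vLLM side, the wrapper would presumably build `LLM(model=..., quantization="awq")` once and return `[o.outputs[0].token_ids for o in llm.generate(prompts, sampling_params)]`; the model name is left out here on purpose. Using the same prompts, sampling settings, and a warm-up run for both backends keeps the comparison fair.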