Open michaelfeil opened 4 months ago
Hi @michaelfeil, any chance you will look more closely into quantizing BERT models with AWQ? Your PR was off to a great start, but it needs more experimentation to figure out how to scale a BERT model.
@casper-hansen I'm open to collaboration, but unfortunately there has been no further progress.
Hoping to add an implementation of 4-bit BERT, potentially in https://github.com/casper-hansen/AutoAWQ/pull/328. Contributions welcome.