Closed the-crypt-keeper closed 9 months ago
mobiuslabsgmbh/Mixtral-8x7B-Instruct-v0.1-hf-attn-4bit-moe-2bit-HQQ
https://github.com/mobiusml/hqq
No transformers, custom runtime.
Completed. Decent results, but the degradation is noticeable. I only managed to get the non-compiled PyTorch backend to work on an A100, and inference was slow. Did not try the custom C++ backend.