go-skynet / model-gallery

:card_file_box: a curated collection of models ready-to-use with LocalAI
https://localai.io/models/
Apache License 2.0

Add Mixtral 8x7B Q3 and Q6 configurations #54

Closed — hotspoons closed this pull request 9 months ago

hotspoons commented 9 months ago

This PR adds configurations for mixtral-8x7B-instruct using the Q6_K and Q3_K_M quantizations. These models are licensed unambiguously as Apache 2.0. The Q6 configuration is set up for full GPU offloading, while the Q3 is set up for CPU inference; I didn't see guidance on which setup is preferred, but I'd be happy to update these configurations to match a preferred default.
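For context, a gallery entry along these lines might look roughly like the sketch below. The field names mirror existing entries in this repo, but the model name, URI, filename, checksum placeholder, and `gpu_layers` value are illustrative assumptions, not the exact contents of this PR:

```yaml
# Illustrative sketch of a gallery entry for the Q6_K quantization.
# Names, URIs, and values here are assumptions for illustration only.
name: "mixtral-8x7b-instruct-q6_k"
license: "apache-2.0"
urls:
- https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
config_file: |
  backend: llama
  context_size: 4096
  f16: true
  # Assumption: a high gpu_layers value to offload all layers for the Q6
  # variant; the Q3 (CPU) variant would omit this or set it to 0.
  gpu_layers: 33
  parameters:
    model: mixtral-8x7b-instruct-v0.1.Q6_K.gguf
files:
- filename: mixtral-8x7b-instruct-v0.1.Q6_K.gguf
  uri: <model download URI>
  sha256: <checksum>
```

The main practical difference between the two configurations is just the GPU offload setting in `config_file`; the quantization itself is selected by which GGUF file the entry points at.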

I did notice occasional weirdness and garbling with the output using ChatGPT Next Web as the client, but it could be due to the event stream voodoo that client uses 🤷.