Already supported @muka https://huggingface.co/TheBloke/Mixtral-8x7B-v0.1-GGUF
I was wrong, my bad
Awesome! Will check it out, thanks
Guess who was wrong... this girl... Hi, hello! I was confused; that was not the model I was thinking about. They seem to be using a newer version of llama.cpp @muka / @mudler
llama.cpp pull request for reference: https://github.com/ggerganov/llama.cpp/pull/4406
Hi! I'm a bot running with LocalAI ( a crazy experiment of @mudler ) - please beware that I might hallucinate sometimes!
_but... I can also be funny or helpful :smile_cat: and I can give generally good tips, or point to the right places in the documentation or the code, based on what you wrote in the issue._
Don't engage in conversation with me, I don't support (yet) replying!
Hello! We are happy to help with your request. LocalAI is a community-driven project and we always appreciate feedback and feature requests. We have not yet had the opportunity to test Mixtral from MistralAI, but it looks like an interesting architecture that could be compatible with LocalAI. We will add it to our list of potential backends to support in the future. In the meantime, you can try to use Mixtral with LocalAI by creating a custom backend. This involves writing a small wrapper script that connects LocalAI with the Mixtral API. You can find more information on how to do this in our documentation: https://docs.localai.org/en/latest/custom-backends.html#creating-a-custom-backend. If you have any further questions or need help with this, feel free to ask!
Sources:
llamacpp pull req for reference ggerganov/llama.cpp#4406
The PR has been merged. We can update llama.cpp.
Could we also get params for: Keys.LLM.EXPERT_COUNT and Keys.LLM.EXPERT_USED_COUNT in the config when loading?
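For reference, those two keys correspond to the `llama.expert_count` and `llama.expert_used_count` metadata entries written into the GGUF file by that PR. A rough sketch of inspecting them with the `gguf` Python package (the file name is just a placeholder):

```python
# Rough sketch: inspect the MoE metadata in a GGUF file with the `gguf`
# Python package (the file name is a placeholder).
from gguf import GGUFReader

reader = GGUFReader("mixtral-8x7b-v0.1.Q4_K_M.gguf")
for key in ("llama.expert_count", "llama.expert_used_count"):
    field = reader.get_field(key)
    if field is not None:
        # scalar fields store their payload in `parts`, indexed via `data`
        print(key, field.parts[field.data[0]][0])
```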
> this in our documentation: https://docs.localai.org/en/latest/custom-backends.html#creating-a-custom-backend. If you

Just to let you know, this link points to an unreachable (?) domain: .org instead of .io.
That's an AI, it's wrong
This PR: https://github.com/mudler/LocalAI/pull/1440 updated llama.cpp to point to the latest version, so master should now work with Mixtral.
Given that this is the only Mixtral-related issue in this repository, I will share the issues I ran into here. All thanks to @lunamidori5, who helped me on Discord.
I tried to set up TheBloke/Mixtral-8x7B-v0.1-GGUF within LocalAI and went down a complete rabbit hole trying to set up my chat template, which didn't want to work. An hour wasted, until Luna recommended using TheBloke/dolphin-2.5-mixtral-8x7b-GGUF instead. That one works with the Luna demo chat template, as found here: https://localai.io/howtos/easy-model/. After this it worked flawlessly!
I'll share my config just in case someone needs it in the future, running on an RTX 3090:
```yaml
context_size: 4000
name: dolphin-2.5-mixtral-8x7b.Q4_K_M.gguf
parameters:
  model: dolphin-2.5-mixtral-8x7b.Q4_K_M.gguf
  temperature: 0.2
  top_k: 80
  top_p: 0.7
template:
  chat: chat-block
  chat_message: chat
  completion: completion
backend: llama-cpp
f16: true
gpu_layers: 28
```
Haven't played with any numbers yet other than the gpu_layers.
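For completeness, the three templates named under `template:` above are separate `.tmpl` files placed next to the model; the exact contents used by the Luna demo are on the linked howto page, but for a ChatML-style model like Dolphin they look roughly like this (a sketch, not copied from the howto):

`chat.tmpl` (the `chat_message` template):

```
<|im_start|>{{.RoleName}}
{{.Content}}<|im_end|>
```

`chat-block.tmpl` (the `chat` template):

```
{{.Input}}
<|im_start|>assistant
```

`completion.tmpl`:

```
{{.Input}}
```

Once the config and templates are in place, the model can be queried through LocalAI's OpenAI-compatible endpoint; a minimal sketch, assuming the default host and port:

```python
# Minimal sketch: query the model through LocalAI's OpenAI-compatible API
# (host/port are the LocalAI defaults; adjust as needed).
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "dolphin-2.5-mixtral-8x7b.Q4_K_M.gguf",
        "messages": [{"role": "user", "content": "Hello!"}],
        "temperature": 0.2,
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```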
I would like to ask (if it is not already planned) for support for Mixtral from MistralAI:
https://docs.mistral.ai/models/
Thank you