-
https://mistral.ai/news/codestral-mamba/
Mistral's latest model uses the Mamba architecture (rather than a Transformer) and targets code generation, with strong performance on the leaderboards.
-
Hello,
I am currently training ArCHer with Mistral 7B on Twenty Questions using 32GB V100 GPUs, but it's taking longer than expected. Could you share any advice on parameter settings that might spe…
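For anyone hitting the same wall, here is a minimal sketch, not ArCHer's actual configuration, of settings that commonly speed up 7B fine-tuning on 32GB V100s: fp16 (V100s lack bf16 support), gradient checkpointing, and LoRA. The model path and LoRA target modules are illustrative placeholders.

```python
# A minimal sketch, NOT ArCHer's actual config: settings that commonly
# speed up 7B fine-tuning on 32GB V100s. Model path and LoRA targets
# are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    torch_dtype=torch.float16,  # V100s have no bf16 support, so use fp16
)
model.gradient_checkpointing_enable()  # trades recompute for a large memory saving
model.config.use_cache = False         # the KV cache conflicts with checkpointing

# LoRA cuts trainable parameters by orders of magnitude vs. full fine-tuning
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # assumed targets; adjust as needed
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```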
-
Hi,
When I try to run the command `!mistral-demo $7B_DIR`, I encounter a GPU-related issue. Could you please suggest a solution? I am using Google Colab.
![issue pic ](https://github.com…
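A first check worth running in this situation: verify that Colab actually allocated a GPU and that it has enough memory for a 7B model. A minimal sketch:

```python
# Quick sanity check that Colab actually allocated a GPU with enough memory.
import torch

if not torch.cuda.is_available():
    raise SystemExit("No GPU allocated: Runtime -> Change runtime type -> GPU")

props = torch.cuda.get_device_properties(0)
print(f"GPU: {props.name}, {props.total_memory / 1e9:.1f} GB")
# A 7B model in fp16 needs roughly 14-15 GB for the weights alone, so the
# free-tier T4 (16 GB) is borderline and smaller GPUs will run out of memory.
```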
-
**Describe**
Thank you for your team's contribution! I would like to fine-tune E5-mistral-7b-instruct for tasks that interest me. Do you have plans to open-source the training code? Alternatively, are th…
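While the training-code question stands open, here is a minimal inference sketch following the last-token pooling convention from the model card; the query text is illustrative, and a real fine-tuning loop would sit on top of this:

```python
# A minimal inference sketch for intfloat/e5-mistral-7b-instruct using the
# last-token pooling described on the model card; the query text is illustrative.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

name = "intfloat/e5-mistral-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(name)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModel.from_pretrained(
    name, torch_dtype=torch.float16 if device == "cuda" else torch.float32
).to(device)

text = "Instruct: Given a web search query, retrieve relevant passages\nQuery: what is mamba?"
batch = tokenizer(text, return_tensors="pt").to(device)
with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq_len, dim)

# Last-token pooling: embed each sequence with its final non-padding token
last = batch["attention_mask"].sum(dim=1) - 1
embedding = hidden[torch.arange(hidden.size(0)), last]
embedding = F.normalize(embedding, p=2, dim=1)  # cosine-ready unit vectors
print(embedding.shape)
```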
-
Hello,
Thanks for the code; it looks great.
I have just pulled your code and tried to run it.
I already ran "ollama run mistral" on my local Mac and it seems to work.
But when I tried …
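Since `ollama run mistral` works in the terminal, a useful isolation step is confirming that the Ollama server answers from Python, independent of the repo's own client code. This sketch assumes Ollama's default local endpoint:

```python
# Isolation step: confirm the Ollama server answers from Python, independent
# of this repo's own client code. Assumes Ollama's default local endpoint.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "mistral", "prompt": "Say hello in one word.", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```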
-
Hi, can I ask about the beta and learning rate of Mistral-7B-Instruct-DPO? I can't reproduce the results in the paper.
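For reference, this is where those two hyperparameters plug in when training with TRL's `DPOTrainer`. The values shown are common starting points from the DPO literature (beta around 0.1, a learning rate far below typical SFT values), not the paper's confirmed settings:

```python
# Where beta and the learning rate plug in with TRL's DPOTrainer. The values
# below are common starting points from the DPO literature, NOT the paper's
# confirmed settings.
from trl import DPOConfig, DPOTrainer

config = DPOConfig(
    output_dir="mistral-7b-instruct-dpo",
    beta=0.1,            # KL penalty: larger keeps the policy closer to the reference
    learning_rate=5e-7,  # DPO typically uses far smaller LRs than SFT
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,
)

# trainer = DPOTrainer(model=model, args=config,
#                      train_dataset=preference_pairs, processing_class=tokenizer)
# trainer.train()
```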
-
**Title:** Evaluation Code Produces Identical Results with Different Caching Methods
**Description:**
It seems the evaluation code produces the same results regardless of the caching method. I used…
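Note that with deterministic (greedy) decoding, identical outputs across caching methods are expected: the KV cache changes speed, not the computed logits. A minimal sketch to verify this, with an illustrative model name:

```python
# With greedy decoding, the KV cache only changes speed, not the logits, so
# identical outputs across caching modes can be the expected behavior.
# Model name is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mistralai/Mistral-7B-v0.1"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.float16, device_map="auto"
)

inputs = tok("The capital of France is", return_tensors="pt").to(model.device)
cached = model.generate(**inputs, max_new_tokens=20, do_sample=False, use_cache=True)
uncached = model.generate(**inputs, max_new_tokens=20, do_sample=False, use_cache=False)
print(torch.equal(cached, uncached))  # True when caching is implemented correctly
```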
-
In the current list of default models there is an odd omission. While most models also have a 32-bit variant available, Mistral 7B does not.
The practical result is that Linux users are missing ou…
-
Dear Eagle Team:
Hello, and thank you very much for your excellent work for the community. Recently, while attempting to replicate Eagle, I encountered some issues that I have been unable to resolv…