shaltielshmid opened 3 months ago
Hi @shaltielshmid 👋
Thanks for the request! I agree that getting support for the Mistral-Nemo
model would be nice 👍 At the moment we might not have the bandwidth to jump on it directly.
If you have even a half-ready PR, please feel free to open it, e.g. as a draft. I think that would help us and speed things up a bit.
Btw, just as a side note: the recommended temperature for Mistral Nemo is 0.3, so it's good to set that as the default when testing as well.
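For anyone testing against a running TGI instance, a minimal sketch of a `/generate` request that pins that recommended temperature (the endpoint URL and prompt are placeholders; the parameter names follow TGI's generate API):

```python
import json

# Placeholder for a local TGI deployment serving Mistral Nemo.
TGI_URL = "http://localhost:8080/generate"

payload = {
    "inputs": "Explain mixture-of-experts in one sentence.",
    "parameters": {
        "temperature": 0.3,     # Mistral's recommended default for Nemo
        "max_new_tokens": 64,
        "do_sample": True,
    },
}

# Serialized request body, ready to POST to TGI_URL.
body = json.dumps(payload)
```

Sending it is then a single `requests.post(TGI_URL, json=payload)` (or the equivalent in your HTTP client of choice).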
Hi @ErikKaum
I opened the half-ready PR, as you suggested. The model loads, but the output is gibberish.
Let me know if I can be of assistance going further.
Thank you 🙌 this already helps a lot.
At the moment I don't think there's much else to do than continue debugging why the model gives gibberish, so if you have bandwidth just go ahead 👍
@ErikKaum once Nemo is able to run through TGI, even though the vocab size is > 130k, do you know if TGI will work with Nemo + a LoRA adapter?
Thank you!
Hi @tensimixt!
Sorry for the slow response. Off the top of my head I can't come up with a reason why it wouldn't work 🤔
Btw, just to verify, was this issue resolved through the #2254 PR or is this still valid?
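For reference, a rough sketch of how a LoRA adapter could be wired up once the base model runs, assuming TGI's multi-LoRA support applies here as usual (the adapter id below is a placeholder, and the flag/parameter names should be checked against the TGI version in use):

```shell
# Launch TGI with a (hypothetical) LoRA adapter preloaded.
text-generation-launcher \
  --model-id mistralai/Mistral-Nemo-Instruct-2407 \
  --lora-adapters my-org/nemo-lora

# Select the adapter per request via the adapter_id parameter.
curl http://localhost:8080/generate \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "Hello", "parameters": {"adapter_id": "my-org/nemo-lora", "max_new_tokens": 32}}'
```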
Model description
This model was released by Mistral here and is available on Hugging Face here. The model is meant to be a drop-in replacement for Mistral-7B, but requires some modifications to handle the tekken tokenizer and the explicit definition of head_dim (see here). I've tried copying the head_dim change into the code, but the model's output was still pure garbage, so I assume there's something else I'm missing.
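To illustrate why the explicit head_dim matters: the usual fallback of deriving the head dimension from hidden_size // num_attention_heads gives the wrong value for this model. The config values below are from Mistral-Nemo's config.json as I recall them, so verify against the actual checkpoint:

```python
def resolve_head_dim(config: dict) -> int:
    """Prefer an explicit head_dim; fall back to the classic derivation."""
    explicit = config.get("head_dim")
    if explicit is not None:
        return explicit
    return config["hidden_size"] // config["num_attention_heads"]

# Approximate Mistral-Nemo config values (verify against config.json).
nemo_cfg = {"hidden_size": 5120, "num_attention_heads": 32, "head_dim": 128}

explicit = resolve_head_dim(nemo_cfg)                              # 128
fallback = nemo_cfg["hidden_size"] // nemo_cfg["num_attention_heads"]  # 160

# The mismatch (128 vs 160) is why code that ignores the explicit
# head_dim produces garbage attention outputs for this model.
```

Any loading path that computes the fallback unconditionally will shape the Q/K/V projections incorrectly, which is consistent with the gibberish output described above.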
Open source status
Provide useful links for the implementation
No response