offgridtech opened this issue 2 months ago
@nguyenhoangthuan99 Can you look into this:
tokenizer.cpp
?

@offgridtech I am transferring this issue to the cortex.cpp repo. We should be working on it; ETA 2 weeks.
I think Mistral Nemo is the first model for which we can run this pipeline automatically to add new model support from Hugging Face:
mistral-nemo
under cortexso
. cc @0xSage @dan-homebrew for help creating it; my account doesn't have permission to do so.

Mistral-nemo is now supported at cortexso. 10 quantization levels are available. All models are created and uploaded automatically through CI.
You can try
mistral-nemo
with cortex-nightly.
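For reference, pulling and running the model with the nightly build looks roughly like this. This is a sketch: the exact subcommand names (`pull`, `run`) and the `mistral-nemo` model ID are assumed from the cortex CLI conventions mentioned above; verify against `cortex --help` on your install.

```shell
# Pull the mistral-nemo model (served from the cortexso Hugging Face org).
# Subcommand names assumed; check `cortex --help` if they differ in your build.
cortex pull mistral-nemo

# Start the model and open an interactive chat session.
cortex run mistral-nemo
```

If the pre-tokenizer fix has landed in the bundled llama.cpp engine, the model should load instead of failing at startup.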
Current behavior
I saw a number of reports on Hugging Face and the llama.cpp GitHub about pre-tokenizers causing issues when the quantized Mistral Nemo model was first released, but it seemed everything was cleared up over the last few days by a llama.cpp update. What worked for other people didn't work for Jan: I've tried several quant versions, and the model fails to start. KoboldCpp and LM Studio say they made updates and it's fixed now, so I'm guessing you need to do the same. Thanks.
More information here:
https://github.com/ggerganov/llama.cpp/pull/8579
https://github.com/ggerganov/llama.cpp/pull/8604
Minimum reproduction step
Load the model; it doesn't start. Other models, like Llama 3.1, start fine.
Expected behavior
The model starts
Screenshots / Logs
This log looks like the pre-tokenizer issue described in the llama.cpp PRs above.
Jan version
v0.5.2
In which operating systems have you tested?
Environment details
AppImage on Linux