janhq / models


model: Mistral Nemo #19

Open · offgridtech opened this issue 2 months ago

offgridtech commented 2 months ago

Current behavior

I see a bunch of reports on HuggingFace and the llama.cpp GitHub about pre-tokenizers causing issues with the initial release of the quantized Mistral Nemo model, but it seemed everything was cleared up over the last few days by a llama.cpp update. What worked for other people didn't work for Jan: I've tried several quant versions, and the model fails to start. KoboldCPP and LMStudio say they made updates and it's fixed there now, so I'm guessing you all need to do the same. Thanks

More information here: https://github.com/ggerganov/llama.cpp/pull/8579 https://github.com/ggerganov/llama.cpp/pull/8604
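
For anyone who wants to confirm which pre-tokenizer a downloaded GGUF declares (the field those llama.cpp PRs deal with), here is a minimal, stdlib-only Python sketch that walks the GGUF metadata section and prints `general.architecture` and `tokenizer.ggml.pre`. It is not part of Jan or llama.cpp, and the filename at the bottom is just a placeholder.

```python
# check_pretokenizer.py -- print the GGUF metadata fields relevant to the
# pre-tokenizer issue. Minimal sketch based on the public GGUF header layout.
import struct
import sys

# Fixed-size GGUF metadata value types: type id -> size in bytes.
FIXED_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
STRING, ARRAY = 8, 9

def read_u32(f):
    return struct.unpack("<I", f.read(4))[0]

def read_u64(f):
    return struct.unpack("<Q", f.read(8))[0]

def read_string(f):
    return f.read(read_u64(f)).decode("utf-8", errors="replace")

def skip_value(f, vtype):
    """Advance past a metadata value without decoding it."""
    if vtype in FIXED_SIZES:
        f.seek(FIXED_SIZES[vtype], 1)
    elif vtype == STRING:
        f.seek(read_u64(f), 1)
    elif vtype == ARRAY:
        elem_type, count = read_u32(f), read_u64(f)
        for _ in range(count):
            skip_value(f, elem_type)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")

def main(path):
    with open(path, "rb") as f:
        assert f.read(4) == b"GGUF", "not a GGUF file"
        version = read_u32(f)
        _tensor_count = read_u64(f)
        kv_count = read_u64(f)
        print(f"GGUF version {version}, {kv_count} metadata entries")
        wanted = {"general.architecture", "tokenizer.ggml.pre"}
        for _ in range(kv_count):
            key = read_string(f)
            vtype = read_u32(f)
            if key in wanted and vtype == STRING:
                print(f"{key} = {read_string(f)}")
            else:
                skip_value(f, vtype)

if __name__ == "__main__":
    # Placeholder filename -- point it at whichever quant you downloaded.
    main(sys.argv[1] if len(sys.argv) > 1 else "Mistral-Nemo-Instruct.Q4_K_M.gguf")
```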

Minimum reproduction step

Try to start any Mistral Nemo quant in Jan: it doesn't start. Other models like Llama 3.1 start fine.

Expected behavior

The model starts

Screenshots / Logs

[screenshot of the model load failure log]

This log looks like it is the pre-tokenizer issue they were talking about.

Jan version

v0.5.2

In which operating systems have you tested?

Linux

Environment details

AppImage on Linux

dan-homebrew commented 1 month ago

@nguyenhoangthuan99 Can you look into this?

nguyenhoangthuan99 commented 1 month ago

dan-homebrew commented 3 weeks ago

@offgridtech I am transferring this issue to the cortex.cpp repo. We should be working on it; ETA 2 weeks.

nguyenhoangthuan99 commented 1 week ago

I think Mistral Nemo is the first model where we run this pipeline automatically to add new model support from Hugging Face.

nguyenhoangthuan99 commented 1 week ago

Mistral-Nemo is now supported at cortexso, with 10 quantization levels available. All models are created and uploaded automatically through CI.

[screenshot of the published quantization levels]
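
As a rough illustration of pulling one of those CI-published quants straight from the Hub, here is a short sketch using `huggingface_hub`. The repo id, branch name, and filename are assumptions about how the cortexso org lays things out, so check the actual model page before relying on them.

```python
# Sketch: fetch one of the CI-published Mistral Nemo quants from the cortexso
# Hugging Face org. Repo id, revision, and filename below are assumptions --
# see https://huggingface.co/cortexso for the real layout.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="cortexso/mistral-nemo",   # assumed repo id under the cortexso org
    filename="model.gguf",             # assumed filename inside the repo
    revision="12b-gguf-q4-km",         # assumed branch naming one quant level
)
print("downloaded to", local_path)
```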

You can try mistral-nemo with cortex-nightly.

[screenshot of mistral-nemo running in cortex-nightly]
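
If the model is served through cortex's OpenAI-compatible endpoint, a quick smoke test could look like the sketch below. The base URL, port, and model id are assumptions on my part; replace them with whatever your local cortex instance actually reports.

```python
# Sketch: once cortex-nightly is running and mistral-nemo has been pulled and
# started, talk to it through its OpenAI-compatible endpoint. The base URL,
# port, and model id here are assumptions -- adjust to your local setup.
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:39281/v1",   # assumed local cortex server address
    api_key="not-needed-for-local-server",
)

resp = client.chat.completions.create(
    model="mistral-nemo",  # assumed model id registered with cortex
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
```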