ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Feature Request: support embedding stella_en_400M and stella_en_400M.gguf conversion #9202

Open raymond-infinitecode opened 3 weeks ago

raymond-infinitecode commented 3 weeks ago

Feature Description

Please add support for stella_en_400M. An embedding model already exists for the 1.5B variant (https://ollama.com/Losspost/stella_en_1.5b_v5), but I couldn't convert stella_en_400M myself.

Model Download: https://hf.rst.im/dunzhang/stella_en_400M_v5

D:\llama.cpp>python convert_hf_to_gguf.py d:/llama.cpp/stella_en_400M_v5 --outfile stella_en_400M.gguf --outtype q8_0
INFO:hf-to-gguf:Loading model: stella_en_400M_v5
ERROR:hf-to-gguf:Model NewModel is not supported
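The error comes from the converter's architecture lookup: convert_hf_to_gguf.py maps the "architectures" entry in the model's config.json to a registered converter class, and stella_en_400M's config reports "NewModel", which has no registered class. A minimal sketch of that registry pattern (illustrative names, not llama.cpp's actual code):

```python
# Sketch of the architecture-registry pattern used by HF-to-GGUF
# converters: each converter class registers the HF architecture
# names it handles; an unregistered name (like "NewModel") fails.
_model_classes: dict[str, type] = {}

def register(*names: str):
    """Decorator: map one or more HF architecture names to a class."""
    def wrap(cls: type) -> type:
        for name in names:
            _model_classes[name] = cls
        return cls
    return wrap

@register("BertModel")
class BertConverter:
    """Hypothetical converter for BERT-style encoders."""
    pass

def lookup(arch: str) -> type:
    """Resolve config.json's 'architectures' entry to a converter."""
    try:
        return _model_classes[arch]
    except KeyError:
        raise NotImplementedError(f"Model {arch} is not supported")
```

Supporting the model would presumably mean adding such a converter class for "NewModel" that maps its tensor names and hyperparameters to GGUF, assuming its architecture is close enough to one already handled.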

Motivation

To have a better embedding model available.

Possible Implementation

No response

0xDEADFED5 commented 3 weeks ago

related models, i believe:

https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5
https://huggingface.co/Alibaba-NLP/gte-base-en-v1.5