raymond-infinitecode opened 2 months ago
Related models, I believe:
https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5
https://huggingface.co/Alibaba-NLP/gte-base-en-v1.5
Please! These are the SoTA for performance-to-resource ratio! @ggerganov we want robust local retrieval models!
Has anyone had any luck creating a GGUF version of stella_en_400M_v5? I've had a go but wasn't successful.
stella_en_400M_v5 is derived from the GTE architecture, which is not officially supported by llama.cpp / ollama; that's probably why no one has managed to convert it.
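For context on why the converter rejects these models: convert_hf_to_gguf.py dispatches on the `architectures` field of the model's config.json, and the GTE models (including stella_en_400M_v5) declare the class name `NewModel`, which has no registered converter. A minimal sketch of that check — the `SUPPORTED` set below is illustrative, not llama.cpp's actual registry:

```python
import json

# Illustrative subset of architectures the converter knows about; the real
# list lives in llama.cpp's convert_hf_to_gguf.py registration decorators.
SUPPORTED = {"LlamaForCausalLM", "BertModel", "NomicBertModel"}

def check_convertible(config_json: str) -> str:
    """Return the declared architecture, raising if it would be rejected."""
    arch = json.loads(config_json)["architectures"][0]
    if arch not in SUPPORTED:
        raise ValueError(f"Model {arch} is not supported")
    return arch

# stella_en_400M_v5's config.json declares the GTE class name "NewModel":
stella_config = '{"architectures": ["NewModel"]}'
try:
    check_convertible(stella_config)
except ValueError as e:
    print(e)  # Model NewModel is not supported
```

So the error message in the original report is an architecture-lookup failure, not a problem with the weights themselves.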
The same issue persists for Alibaba-NLP/gte-multilingual-base. Any updates on this? @ggerganov
Prerequisites
Feature Description
Need help supporting stella_en_400M. I see there is already an embedding model at https://ollama.com/Losspost/stella_en_1.5b_v5, but I couldn't convert stella_en_400M myself.
Model Download: https://hf.rst.im/dunzhang/stella_en_400M_v5
D:\llama.cpp>python convert_hf_to_gguf.py d:/llama.cpp/stella_en_400M_v5 --outfile stella_en_400M.gguf --outtype q8_0
INFO:hf-to-gguf:Loading model: stella_en_400M_v5
ERROR:hf-to-gguf:Model NewModel is not supported
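The "is not supported" error comes from a registry lookup: each supported architecture registers a converter class under its Hugging Face class name, and an unknown name raises. Adding stella/GTE support would mean registering a converter for "NewModel" and mapping its tensors. A toy mock of that registration pattern (names here are illustrative, not llama.cpp's real classes):

```python
# Toy registry mimicking how convert_hf_to_gguf.py maps an HF architecture
# name to a converter class via a registration decorator.
_model_classes: dict[str, type] = {}

def register(*names):
    """Register a converter class under one or more architecture names."""
    def wrapper(cls):
        for name in names:
            _model_classes[name] = cls
        return cls
    return wrapper

@register("BertModel")
class BertConverter:
    pass

def from_model_architecture(arch: str) -> type:
    """Look up the converter for an architecture, as the script does."""
    try:
        return _model_classes[arch]
    except KeyError:
        raise NotImplementedError(f"Model {arch!r} is not supported") from None

print(from_model_architecture("BertModel").__name__)  # BertConverter
# from_model_architecture("NewModel") would raise NotImplementedError,
# which is the failure reported above for stella_en_400M_v5.
```

Under this reading, the feature request amounts to registering a "NewModel" converter plus the corresponding tensor-name and hyperparameter mapping on the C++ side.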
Motivation
To have a better local embedding model available.
Possible Implementation
No response