argilla-io / distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
https://distilabel.argilla.io
Apache License 2.0

[FEATURE] add `llama-cpp` to `distilabel.embeddings` #954

Open davidberenstein1957 opened 3 weeks ago

davidberenstein1957 commented 3 weeks ago

Is your feature request related to a problem? Please describe.
It would be nice to integrate llama-cpp-python embeddings (https://llama-cpp-python.readthedocs.io/en/stable/#embeddings), because the default sentence-transformers embeddings can be slow.

Describe the solution you'd like
A class similar to the existing sentence-transformers embeddings class, but backed by a different framework (llama-cpp).

Describe alternatives you've considered
N/A

Additional context
There are various GGUF sentence-transformers models available on the Hub: https://huggingface.co/models?library=sentence-transformers,gguf&sort=trending
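
To make the request a bit more concrete, here is a rough sketch of what such a class could look like. It is only an illustration, not an existing distilabel API: the class name `LlamaCppEmbeddingsSketch` and its `load()`/`encode()` methods are hypothetical, loosely modelled on how an embeddings wrapper might be shaped. The only llama-cpp-python calls used are `Llama(..., embedding=True)` and `Llama.create_embedding(...)`, which are described in the docs linked above. It assumes the GGUF model returns one pooled vector per input.

```python
from typing import Any, List, Optional


class LlamaCppEmbeddingsSketch:
    """Hypothetical wrapper around llama-cpp-python embeddings (not part of distilabel)."""

    def __init__(self, model_path: str, n_gpu_layers: int = 0) -> None:
        self.model_path = model_path  # path to a local GGUF file (assumption)
        self.n_gpu_layers = n_gpu_layers
        self._model: Optional[Any] = None

    def load(self) -> None:
        # Imported lazily so this module can be imported without llama-cpp-python installed.
        from llama_cpp import Llama

        self._model = Llama(
            model_path=self.model_path,
            embedding=True,  # required to enable the embeddings API
            n_gpu_layers=self.n_gpu_layers,
            verbose=False,
        )

    def encode(self, inputs: List[str]) -> List[List[float]]:
        """Return one embedding per input string, in input order."""
        if self._model is None:
            raise RuntimeError("Call load() before encode().")
        # create_embedding returns an OpenAI-style payload with a "data" list.
        response = self._model.create_embedding(inputs)
        items = sorted(response["data"], key=lambda item: item["index"])
        return [item["embedding"] for item in items]
```

Usage would be along the lines of `emb = LlamaCppEmbeddingsSketch("path/to/model.gguf"); emb.load(); emb.encode(["Hello, world!"])`, where the model path is a placeholder. A real implementation would presumably subclass the existing embeddings base class and could also support downloading the GGUF file from the Hub.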

bikash119 commented 1 week ago

@davidberenstein1957: Can I take a stab at it?