Closed · michaelfeil closed this issue 2 months ago
Got some suggestions for frameworks for self-hosted serving of LLMs and related models.

Embeddings from OpenAI CLIP:
Jina clip-as-service https://github.com/jina-ai/clip-as-service (Apache-2.0)

Text embeddings:
My own project: infinity (just add it if you like it) https://github.com/michaelfeil/infinity (MIT)
https://github.com/huggingface/text-embeddings-inference (no open-source license)

LLM inference as a service:
Hugging Face TGI https://github.com/huggingface/text-generation-inference (no longer an open-source license after 1.0)
NVIDIA TensorRT-LLM (Apache-2.0)
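Several of the servers above (infinity, text-embeddings-inference) expose an OpenAI-compatible `/embeddings` REST endpoint, so clients can be written once against that schema. A minimal stdlib-only sketch, assuming a local server at `http://localhost:7997/embeddings` and the model name `BAAI/bge-small-en-v1.5` (both are illustrative placeholders, not taken from the thread):

```python
import json

def build_embedding_request(texts, model="BAAI/bge-small-en-v1.5"):
    """Build an OpenAI-style /embeddings request body (model name is an example)."""
    return json.dumps({"model": model, "input": texts}).encode()

def parse_embedding_response(raw):
    """Extract vectors from an OpenAI-style response: {"data": [{"embedding": [...]}, ...]}."""
    payload = json.loads(raw)
    return [item["embedding"] for item in payload["data"]]

# Usage against a running local server (commented out; requires a live instance):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:7997/embeddings",
#     data=build_embedding_request(["hello world"]),
#     headers={"Content-Type": "application/json"},
# )
# vectors = parse_embedding_response(urllib.request.urlopen(req).read())
```

Because the wire format is shared, swapping one backend for another is mostly a matter of changing the URL and model name.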
Thanks @michaelfeil! Will add soon. There are some other papers I noticed; I'll add those over the weekend.
Merged. Thanks @michaelfeil!