logancyang / obsidian-copilot


Add support for Hugging Face Inference API #374


kteppris commented 5 months ago

Is your feature request related to a problem? Please describe.
Currently we are hosting open-source models like Mixtral-8x7B with the Hugging Face Inference Endpoints. Since TGI version 1.4, this API behaves like the OpenAI API, so it is possible to transition seamlessly from OpenAI models to our open-source models without changing any code (see https://huggingface.co/blog/tgi-messages-api). However, as described in #360, using the Hugging Face endpoint URL as the base URL for the OpenAI proxy gives me the CORS error described in that issue.
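For reference, a minimal sketch of what the TGI 1.4 Messages API looks like from TypeScript. The endpoint URL is a placeholder for one's own Inference Endpoint, and `HF_TOKEN` is assumed to hold a Hugging Face access token:

```ts
// Hypothetical self-hosted TGI >= 1.4 Inference Endpoint (placeholder URL).
const ENDPOINT = "https://my-endpoint.endpoints.huggingface.cloud";

// TGI 1.4 exposes an OpenAI-compatible /v1/chat/completions route,
// so a plain fetch with the OpenAI request schema is enough.
const response = await fetch(`${ENDPOINT}/v1/chat/completions`, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${process.env.HF_TOKEN}`,
  },
  body: JSON.stringify({
    model: "tgi", // nominal; TGI serves its single deployed model
    messages: [{ role: "user", content: "Summarize my note in one sentence." }],
    max_tokens: 256,
  }),
});

const data = await response.json();
console.log(data.choices[0].message.content);
```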

Describe the solution you'd like
Besides fixing the CORS errors, one could alternatively add the functionality to use the Hugging Face Inference API directly, allowing users to host their own inference server and make calls to it. This would be possible for the chat model as well as for the embeddings (see the sketch after the two links below):

https://js.langchain.com/docs/integrations/llms/huggingface_inference

https://js.langchain.com/docs/integrations/text_embedding/hugging_face_inference
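A rough sketch of what those two integrations could look like in LangChain JS. The model names and the endpointUrl are placeholders, and the exact option names should be checked against the installed @langchain/community version:

```ts
import { HuggingFaceInference } from "@langchain/community/llms/hf";
import { HuggingFaceInferenceEmbeddings } from "@langchain/community/embeddings/hf";

// Chat/completion model served through the Hugging Face Inference API.
// endpointUrl points at a self-hosted Inference Endpoint (placeholder URL).
const llm = new HuggingFaceInference({
  model: "mistralai/Mixtral-8x7B-Instruct-v0.1",
  apiKey: process.env.HUGGINGFACEHUB_API_KEY,
  endpointUrl: "https://my-endpoint.endpoints.huggingface.cloud", // hypothetical
});

// Embeddings through the same API, e.g. for indexing the vault.
const embeddings = new HuggingFaceInferenceEmbeddings({
  model: "sentence-transformers/all-MiniLM-L6-v2",
  apiKey: process.env.HUGGINGFACEHUB_API_KEY,
});

const answer = await llm.invoke("What is an Obsidian vault?");
const vector = await embeddings.embedQuery("What is an Obsidian vault?");
```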

This empowers users to switch to open-source models and not share their data with a third party, without having to run the models on their edge device with Ollama or the like, making it possible in the near future to leverage open-source models on a small home server.

Describe alternatives you've considered
Alternatively, it would be possible to only support the models that seamlessly integrate with the LangChain OpenAI component, as described here (with LangChain Python, but it should work the same with JS): https://huggingface.co/blog/tgi-messages-api#integrate-with-langchain-and-llamaindex
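The JS equivalent of the Python snippet in that blog post would look roughly like this, assuming @langchain/openai is available; the base URL is again a placeholder for one's own TGI endpoint, and option names may vary slightly between versions:

```ts
import { ChatOpenAI } from "@langchain/openai";

// Point the standard OpenAI chat model at a TGI >= 1.4 endpoint.
// Because TGI speaks the OpenAI Messages API, nothing else has to change.
const llm = new ChatOpenAI({
  model: "tgi", // nominal; TGI serves whatever single model is deployed
  apiKey: process.env.HF_TOKEN, // HF access token in place of an OpenAI key
  configuration: {
    baseURL: "https://my-endpoint.endpoints.huggingface.cloud/v1", // hypothetical
  },
});

const reply = await llm.invoke("Hello from Obsidian Copilot!");
console.log(reply.content);
```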

iukea1 commented 5 months ago

This would be sick!

nfroseth commented 5 months ago

Looking forward to this!