jackmpcollins / magentic

Seamlessly integrate LLMs as Python functions
https://magentic.dev/
MIT License

Feature Request: Implement HuggingFace TGI API client #278

Open · michael-conrad opened this issue 3 months ago

michael-conrad commented 3 months ago

HuggingFace provides a standard Docker container, Text Generation Inference (TGI), for serving LLM requests.

It would be useful if magentic could take advantage of TGI's generation features.

jackmpcollins commented 3 months ago

@michael-conrad

> Text Generation Inference (TGI) now supports the Messages API, which is fully compatible with the OpenAI Chat Completion API. This feature is available starting from version 1.4.0. You can use OpenAI's client libraries or third-party libraries expecting OpenAI schema to interact with TGI's Messages API. Below are some examples of how to utilize this compatibility.

from https://huggingface.co/docs/text-generation-inference/messages_api
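For reference, here is a minimal sketch of calling a TGI server directly through the OpenAI Python client, along the lines of the examples in those docs. The localhost URL, port, and placeholder `api_key` are assumptions for a locally running container:

```python
from openai import OpenAI

# Point the standard OpenAI client at a locally running TGI server.
# The URL/port is an assumption - use whatever your container exposes.
client = OpenAI(
    base_url="http://localhost:8080/v1",  # TGI Messages API endpoint
    api_key="-",  # TGI does not check the key, but the client requires one
)

response = client.chat.completions.create(
    model="tgi",  # TGI serves a single model, so this field is ignored
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response.choices[0].message.content)
```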

So you should be able to use magentic with HuggingFace TGI by setting the `base_url` param of `OpenaiChatModel`. See https://magentic.dev/configuration/
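Something like the following sketch should work (untested; the `base_url`, `api_key`, and model name values are the same assumptions as in the example above):

```python
from magentic import prompt, OpenaiChatModel

# Point magentic's OpenAI-compatible chat model at the TGI server.
model = OpenaiChatModel(
    "tgi",  # TGI ignores the model name, but one must be provided
    base_url="http://localhost:8080/v1",
    api_key="-",
)

@prompt("Tell me a short joke about {topic}.", model=model)
def tell_joke(topic: str) -> str: ...

print(tell_joke("containers"))
```

Alternatively the base URL can be set via the `MAGENTIC_OPENAI_BASE_URL` environment variable, as described on the configuration page linked above.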

Please let me know here if this works for you or if there are any issues. If it works, I'd be happy to accept a PR adding a section to the docs about it, with useful links and any additional setup steps needed.