huggingface / text-generation-inference

Large Language Model Text Generation Inference
http://hf.co/docs/text-generation-inference
Apache License 2.0

Feature request: Add documentation and examples for adding additional API endpoints. #2321

Open michael-conrad opened 3 months ago

michael-conrad commented 3 months ago

Feature request

I would like to be able to use guidance or other libraries that support constrained output with HF endpoints.

Reference: A guidance language for controlling large language models.

Motivation

I want to use a library such as guidance for constrained generation with an HF inference endpoint, so that we can use larger models that exceed our local computational capabilities.

Your contribution

I don't know where to start with adding an API endpoint to the existing TGI config.

ErikKaum commented 3 months ago

Hi @michael-conrad 🙌

We have structured generation support in TGI through outlines.

Would this solve your problem?

docs: https://huggingface.co/docs/text-generation-inference/basic_tutorials/using_guidance
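Per those docs, TGI's structured generation is exposed on the existing `/generate` endpoint via a `grammar` parameter (rather than a new endpoint). A minimal sketch of building such a request body, assuming a local TGI deployment at a placeholder URL:

```python
import json

# Hypothetical endpoint URL; replace with your own TGI deployment.
TGI_URL = "http://localhost:3000/generate"

# JSON Schema the output must conform to, passed under
# parameters.grammar with type "json" per TGI's guidance docs.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

payload = {
    "inputs": "Extract the person mentioned: Alice is 30 years old.",
    "parameters": {
        "grammar": {"type": "json", "value": schema},
        "max_new_tokens": 64,
    },
}

body = json.dumps(payload)
# To send it, POST `body` to TGI_URL with a
# Content-Type: application/json header, e.g. via requests.post.
print(body)
```

The schema and prompt above are illustrative only; any valid JSON Schema can be supplied as the grammar value.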

michael-conrad commented 3 months ago

See also: https://github.com/guidance-ai/guidance/issues/952

ErikKaum commented 3 months ago

Okay thanks for pointing that out 👍

One thing that's probably good to clarify is that TGI and the Inference Endpoints are not dependent on each other. They are two separate things.

So in this case I think adding a custom Inference Handler (also linked in the guidance issue) is the way to go. TGI per se doesn't have a config to add new endpoints.

Does this make sense?