huggingface / huggingface_hub

The official Python client for the Hugging Face Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0

[Inference] Support `stop` parameter in `text-generation` instead of `stop_sequences` #2473

Closed Wauplin closed 2 months ago

Wauplin commented 3 months ago

Fix https://github.com/huggingface/huggingface_hub/issues/2471 cc @sadra-barikbin.

This PR deprecates stop_sequences in favor of the stop parameter for the text_generation task.

Context: in both TGI and the text_generation specs, the stop parameter is used to provide stop sequences to the model. Historically, however, transformers used the stop_sequences parameter, which was propagated to the Inference API and InferenceClient. Since we are now TGI-first (i.e. even transformers models are served with TGI), let's just expose stop.

>>> from huggingface_hub import InferenceClient
>>> InferenceClient("gpt2").text_generation("The capital of France is", stop=["Republic"])
 the capital of the French Republic
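The deprecation described above follows a common pattern: accept the old keyword, emit a warning, and map its value onto the new parameter. Here is a minimal, self-contained sketch of that pattern (not the actual huggingface_hub implementation; the function body is a hypothetical stand-in that just returns the resolved stop list):

```python
import warnings
from typing import List, Optional

def text_generation(
    prompt: str,
    stop: Optional[List[str]] = None,
    stop_sequences: Optional[List[str]] = None,  # deprecated
) -> List[str]:
    """Illustrative sketch of deprecating `stop_sequences` in favor of `stop`."""
    if stop_sequences is not None:
        # Warn callers still using the old keyword, but keep it working.
        warnings.warn(
            "`stop_sequences` is deprecated and will be removed; use `stop` instead.",
            FutureWarning,
        )
        if stop is None:
            stop = stop_sequences
    # A real client would now send `stop` to the TGI endpoint;
    # here we simply return the resolved value for illustration.
    return stop or []
```

Calling the sketch with the old keyword still works but raises a FutureWarning, giving users a migration window before the parameter is removed.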
HuggingFaceDocBuilderDev commented 3 months ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Wauplin commented 2 months ago

Thanks both for the reviews!