huggingface / huggingface_hub

The official Python client for the Hugging Face Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0

[Inference] Support `stop` parameter in `text-generation` instead of `stop_sequences` #2473

Closed Wauplin closed 2 months ago

Wauplin commented 3 months ago

Fix https://github.com/huggingface/huggingface_hub/issues/2471 cc @sadra-barikbin.

This PR deprecates stop_sequences in favor of the stop parameter for the text_generation task.

Context: in both TGI and the text_generation specs, the stop parameter is used to provide stop sequences to the model. Historically, however, transformers used the stop_sequences parameter, which was propagated to the Inference API and InferenceClient. Since we are now TGI-first (i.e. even transformers models are served with TGI), let's just expose stop.

>>> from huggingface_hub import InferenceClient
>>> InferenceClient("gpt2").text_generation("The capital of France is", stop=["Republic"])
 the capital of the French Republic
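The deprecation described above follows a common pattern: accept the old keyword, emit a warning, and map its value onto the new parameter. Here is a minimal, self-contained sketch of that pattern (not the actual huggingface_hub implementation; the function body is a hypothetical stand-in that just returns the resolved stop list):

```python
import warnings
from typing import List, Optional

def text_generation(
    prompt: str,
    stop: Optional[List[str]] = None,
    stop_sequences: Optional[List[str]] = None,  # deprecated
) -> List[str]:
    """Illustrative sketch of deprecating `stop_sequences` in favor of `stop`."""
    if stop_sequences is not None:
        # Warn callers still using the old keyword, but keep it working.
        warnings.warn(
            "`stop_sequences` is deprecated and will be removed; use `stop` instead.",
            FutureWarning,
        )
        if stop is None:
            stop = stop_sequences
    # A real client would now send `stop` to the TGI endpoint;
    # here we simply return the resolved value for illustration.
    return stop or []
```

Calling the sketch with the old keyword still works but raises a FutureWarning, giving users a migration window before the parameter is removed.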
HuggingFaceDocBuilderDev commented 3 months ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Wauplin commented 2 months ago

Thanks both for the reviews!