v0.22.0: Chat completion, inference types and hub mixins!
Discuss the release in our Community Tab. Feedback is welcome! 🤗
✨ InferenceClient
Support for inference tools continues to improve in huggingface_hub. On the menu in this release? A new chat_completion API and fully typed inputs/outputs!
Chat-completion API!
A long-awaited API has just landed in huggingface_hub! InferenceClient.chat_completion follows most of OpenAI's API, making it much easier to integrate with existing tools.
Technically speaking, it uses the same backend as the text-generation task but requires a preprocessing step to format the list of messages into a single text prompt. The chat template is rendered server-side when models are powered by TGI, which is the case for most LLMs: Llama, Zephyr, Mistral, Gemma, etc. Otherwise, the templating happens client-side, which requires the minijinja package to be installed. We are actively working on bridging this gap, aiming to render all templates server-side in the future.
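To make the preprocessing step concrete, here is a minimal sketch of how a list of messages might be flattened into a single text prompt. It is not the library's implementation: `render_prompt` is a hypothetical helper, and the `<|role|>` markers and `</s>` end-of-sequence token are an assumed Zephyr-style layout. The real template is defined per model and rendered with a Jinja-compatible engine.

```python
from typing import Dict, List

# Assumed end-of-sequence marker (Zephyr-style); real models define their own.
EOS = "</s>"


def render_prompt(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Flatten chat messages into one text prompt (hypothetical helper)."""
    parts = []
    for message in messages:
        role, content = message["role"], message["content"]
        if role not in {"system", "user", "assistant"}:
            raise ValueError(f"Unsupported role: {role!r}")
        # Each turn is written as a role marker, the content, then EOS.
        parts.append(f"<|{role}|>\n{content}{EOS}\n")
    if add_generation_prompt:
        # Leave an open assistant turn so the model continues from here.
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [{"role": "user", "content": "What is the capital of France?"}]
prompt = render_prompt(messages)
print(prompt)
```

The resulting string is what a text-generation backend would receive. Whether this rendering happens on the server or on the client is the distinction the paragraph above describes.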
>>> from huggingface_hub import InferenceClient
>>> messages = [{"role": "user", "content": "What is the capital of France?"}]
>>> client = InferenceClient("HuggingFaceH4/zephyr-7b-beta")
# Batch completion
>>> client.chat_completion(messages, max_tokens=100)
ChatCompletionOutput(
    choices=[
        ChatCompletionOutputChoice(
            finish_reason='eos_token',
            index=0,
            message=ChatCompletionOutputChoiceMessage(
                content='The capital of France is Paris. The official name of the city is "Ville de Paris" (City of Paris) and the name of the country's governing body, which is located in Paris, is "La République française" (The French Republic). \nI hope that helps! Let me know if you need any further information.'
            )
        )
    ],
    created=1710498360
)
# Stream new tokens one by one
>>> for token in client.chat_completion(messages, max_tokens=10, stream=True):
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
Bumps huggingface-hub from 0.20.3 to 0.22.2.
Release notes
Sourced from huggingface-hub's releases.
... (truncated)
Commits
- `06f48e6` Release: v0.22.2
- `0c7986e` Fix URL when uploading to proxy (#2167)
- `0dd879b` Fix proxy if dynamic endpoint
- `4e738e2` Fix HF_ENDPOINT not handled correctly (#2155)
- `4ecdbee` Release: v0.22.1
- `45881a1` Merge branch 'main' into v0.22-release
- `a9453d9` Fix ModelHubMixin when class is a dataclass (#2159)
- `31261db` Release: v0.22.0
- `9e13b83` Fix use other chat completion providers (#2153)
- `20d8491` Release: v0.22.0.rc1