huggingface / huggingface_hub

The official Python client for the Hugging Face Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0

Truly `openai` drop-in replacement for chat completion #2369

Closed Wauplin closed 1 day ago

Wauplin commented 1 week ago

_Originally from @philschmid on slack (private):_

Being OpenAI-compatible for serving has become a standard, so it's awesome that TGI is. It also means we can use the openai SDK, and it lets us communicate that you can switch from closed source to open source by changing one variable (the URL). A lot of people are already familiar with the openai SDK, so the closer we are to it, the easier it is for them to understand and switch. I get that for new features we can't mirror openai, but it would be cool to have parity with everything existing, e.g.:

from huggingface_hub import InferenceClient

client = InferenceClient(
    base_url='http://huggingface.co/api/integrations/dgx',
    api_key='hf_xx',
)
resp = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant." },
        {"role": "user", "content": "Count to 50"}
    ],
    stream=False,
    max_tokens=1024
)

print(resp.choices[0].message.content)

or

from huggingface_hub import InferenceClient

client = InferenceClient()

resp = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant." },
        {"role": "user", "content": "Count to 50"}
    ],
    stream=False,
    max_tokens=1024
)

print(resp.choices[0].message.content)
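The "change one variable" idea above can be sketched at the wire level: an OpenAI-style chat-completions request is the same JSON payload regardless of provider, and only the base URL and API key differ. A minimal illustration (the URLs and keys below are placeholders, not real endpoints):

```python
import json

def build_request(base_url: str, api_key: str, model: str, messages: list) -> tuple:
    """Assemble an OpenAI-style chat-completions HTTP request (url, headers, body)."""
    url = f"{base_url}/chat/completions"
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    body = json.dumps({"model": model, "messages": messages, "stream": False}).encode()
    return url, headers, body

messages = [{"role": "user", "content": "Count to 50"}]

# Same payload shape for both providers; only base_url, api_key, and model change.
closed = build_request("https://api.openai.com/v1", "sk-...", "gpt-4o", messages)
open_ = build_request("https://example-tgi-host/v1", "hf_...", "meta-llama/Meta-Llama-3-8B-Instruct", messages)
```

This is exactly why exposing the `client.chat.completions.create(...)` surface matters: users keep their mental model and swap only the connection details.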

Note: to do that we'll need to:

lappemic commented 1 week ago

I would love to work on this @Wauplin.

To clarify:

Wauplin commented 1 week ago

Hi @lappemic, thanks for volunteering! For now I've only created the issue; I'm not yet sure whether and how we want to proceed. I've started testing things locally to see what the changes would look like (API-wise but also internally). I think we will introduce changes only for openai compatibility while keeping the current naming for most/all existing tasks, without breaking changes. Changes will land in both _client.py and the async client (automatically generated).
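One way such non-breaking layering could look (a hypothetical sketch, not the actual implementation that landed in huggingface_hub): an openai-style `chat.completions.create` namespace that simply delegates to an existing method kept under its current name.

```python
# Hypothetical sketch: add an openai-compatible `client.chat.completions.create(...)`
# entry point that delegates to an existing `chat_completion()` method, so current
# callers keep working unchanged. All class names here are illustrative.

class _Completions:
    def __init__(self, client):
        self._client = client

    def create(self, *, model, messages, stream=False, max_tokens=None):
        # Delegate to the task method under its existing name.
        return self._client.chat_completion(
            model=model, messages=messages, stream=stream, max_tokens=max_tokens
        )

class _Chat:
    def __init__(self, client):
        self.completions = _Completions(client)

class DemoClient:
    def __init__(self):
        self.chat = _Chat(self)  # openai-style namespace

    def chat_completion(self, *, model, messages, stream=False, max_tokens=None):
        # Stand-in for the real inference call; echoes the last user message.
        return {"model": model, "echo": messages[-1]["content"]}

client = DemoClient()
resp = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[{"role": "user", "content": "Count to 50"}],
)
print(resp["echo"])  # -> Count to 50
```

Because the compatibility surface is a thin delegating layer, both the sync client and the auto-generated async client pick it up without duplicating logic.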

lappemic commented 1 week ago

Thanks for the feedback.

OK, let me know once you've decided how to proceed and whether you'd like some help!

Wauplin commented 2 days ago

In the end I opened https://github.com/huggingface/huggingface_hub/pull/2384 with the corresponding changes. The new arguments still need to be discussed.