Azure / azure-sdk-for-net

This repository is for active development of the Azure SDK for .NET. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/dotnet/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-net.
MIT License

[FEATURE REQ] Azure.AI.OpenAI: Support HuggingFace chat completion streaming API #44135

Open dluc opened 2 months ago

dluc commented 2 months ago

Library name

Azure.AI.OpenAI

Please describe the feature.

The HuggingFace chat completion streaming API is designed to imitate the OpenAI streaming response. However, because of two minor differences, pointing the Azure SDK OpenAIClient at HuggingFace causes the GetCompletionsStreaming method to hang indefinitely:

  1. HF doesn't terminate the stream with [DONE], so the while loop in SseAsyncEnumerator never breaks.

  2. HF doesn't support a NucleusSamplingFactor of 0.0 and returns the error {"error":"Input validation error: `top_p` must be > 0.0 and < 1.0","error_type":"validation"}. Unfortunately the response status code is 200 OK, so no exception is thrown. Users can work around this by passing 0.01 instead, but nothing tells them to change the value.
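To make the two points above concrete, here is a minimal, language-neutral sketch in Python of a more tolerant SSE reader. It is not the SDK's SseAsyncEnumerator and not a proposed patch. The names `iter_sse_chunks` and `StreamErrorPayload` are hypothetical. It also rests on two assumptions: that the server closes the connection after the last event, and that the error object may arrive either as a `data:` line or as a bare JSON body.

```python
import json
from typing import Iterable, Iterator


class StreamErrorPayload(Exception):
    """Hypothetical exception: an event carried an error object, not a chunk."""


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield JSON payloads from a sequence of SSE text lines.

    Stops at the '[DONE]' sentinel if the server sends one, and also stops
    when the line source is exhausted (assumption: the server closed the
    connection), so a missing sentinel does not cause an endless wait.
    """
    for raw in lines:
        line = raw.strip()
        if line.startswith("data:"):
            data = line[len("data:"):].strip()
        elif line.startswith("{"):
            # Assumption: an error may be sent as a bare JSON body
            # without the SSE 'data:' prefix.
            data = line
        else:
            continue  # blank separators, comments, other SSE fields

        if data == "[DONE]":
            return

        payload = json.loads(data)
        if isinstance(payload, dict) and "error" in payload:
            # Surface errors that arrive with a 200 OK status.
            raise StreamErrorPayload(str(payload["error"]))
        yield payload
```

With a reader shaped like this, a stream that ends without [DONE] simply finishes, and the `top_p` validation message would reach the caller as an exception instead of being silently dropped.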

It would be great if the Azure AI SDK had a way to work around these issues, for instance:

See also https://github.com/huggingface/text-generation-inference/issues/1896 and https://github.com/microsoft/kernel-memory/issues/388

github-actions[bot] commented 2 months ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jpalvarezl @trrwilson.