BerriAI / litellm

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
https://docs.litellm.ai/docs/

[Bug]: Azure OpenAI attempts to reuse Azure AD token after it expires #4417

Open Manouchehri opened 3 months ago

Manouchehri commented 3 months ago

What happened?

I think this is because the OpenAI client is cached somewhere: after an hour, the now-expired Azure AD token is still being used and I get a 401. I don't think this is a bug on the OIDC side of things.

Relevant log output

No response

Twitter / LinkedIn details

https://www.linkedin.com/in/davidmanouchehri/

krrishdholakia commented 3 months ago

can you share the error / any way to repro this @Manouchehri

Manouchehri commented 3 months ago

You have to have the proxy running for over an hour, and be using OIDC for Azure OpenAI auth.

pamelafox commented 2 months ago

It looks like litellm currently accepts a static token; could it be extended to support a token_provider instead? The OpenAI client will call that to dynamically fetch a new token, so you shouldn't hit this issue.
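The mechanics of that suggestion can be shown with a toy sketch (`FakeAzureClient` and `make_token_provider` are stand-ins I made up, not the real openai/azure-identity API): the client stores a callable rather than a string and invokes it on every request, so the token in use can never go stale.

```python
from typing import Callable

class FakeAzureClient:
    """Toy stand-in for AzureOpenAI: holds a zero-arg callable and invokes it
    on every request, so a fresh token is fetched each time."""
    def __init__(self, azure_ad_token_provider: Callable[[], str]):
        self._provider = azure_ad_token_provider

    def request(self) -> str:
        # Real clients do this per HTTP call; a cached static string cannot refresh.
        return f"Authorization: Bearer {self._provider()}"

def make_token_provider() -> Callable[[], str]:
    """Returns a provider that yields a new token value on each call."""
    state = {"n": 0}
    def provider() -> str:
        state["n"] += 1  # real refresh logic (credential.get_token) would live here
        return f"token-{state['n']}"
    return provider

client = FakeAzureClient(make_token_provider())
print(client.request())  # Authorization: Bearer token-1
print(client.request())  # Authorization: Bearer token-2
```

With the real SDKs, the provider comes from azure-identity's `get_bearer_token_provider` and is passed to `AzureOpenAI` via its `azure_ad_token_provider` parameter.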

krrishdholakia commented 2 months ago

hey @pamelafox sure - how would you want to do that? (mock code would be great)

Manouchehri commented 2 months ago

https://github.com/langchain-ai/langchain/issues/14069#issue-2018694209

https://github.com/Azure/azure-sdk-for-python/blob/6d066c6847cf65c2058f30afad09da853d69dd83/sdk/identity/azure-identity/azure/identity/_bearer_token_provider.py#L41-L44

It looks like it might just be a function I need to create that returns a string.
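That's essentially right: per the `_bearer_token_provider.py` source linked above, the provider is just a zero-argument callable that returns the current access token string. A self-contained sketch with a stand-in credential (`FakeCredential` is made up; `get_token` is the real azure-identity method name):

```python
import time

class FakeCredential:
    """Stand-in for an azure.identity credential. Real credentials cache the
    token and refresh it when it nears expiry."""
    def get_token(self, *scopes):
        class Token:
            token = "eyJ-fresh-token"
            expires_on = int(time.time()) + 3600
        return Token()

def get_bearer_token_provider(credential, *scopes):
    """Simplified version of azure-identity's helper: wrap the credential in a
    zero-arg function that returns the current token string."""
    def provider() -> str:
        return credential.get_token(*scopes).token
    return provider

provider = get_bearer_token_provider(
    FakeCredential(), "https://cognitiveservices.azure.com/.default"
)
print(provider())  # eyJ-fresh-token
```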

krrishdholakia commented 2 months ago

hey @Manouchehri this seems similar to how we had to handle refresh token for google - https://github.com/BerriAI/litellm/blob/88eb25da5c536ef3f95759acfa1db84a7b260c86/litellm/llms/vertex_httpx.py#L780

maybe we can have a similar helper function in azure.py?
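A hypothetical shape for such a helper, mirroring the expiry-check-then-refresh pattern in the vertex code linked above (all names here are illustrative, not litellm's actual API):

```python
import time

class AzureADTokenCache:
    """Cache an Azure AD token alongside its expiry; re-run the exchange
    (e.g. the oidc/* flow) when the token is close to expiring."""
    def __init__(self, fetch_token, leeway: int = 300):
        self._fetch_token = fetch_token  # returns (token, expires_at_epoch)
        self._leeway = leeway
        self._token = None
        self._expires_at = 0.0

    def get(self) -> str:
        if self._token is None or time.time() >= self._expires_at - self._leeway:
            self._token, self._expires_at = self._fetch_token()
        return self._token

# Demonstration with a fake exchange that counts how often it runs:
calls = []
def fake_exchange():
    calls.append(1)
    return f"token-{len(calls)}", time.time() + 3600

cache = AzureADTokenCache(fake_exchange)
print(cache.get())  # token-1
print(cache.get())  # token-1 (still valid, no second exchange)
```

The key difference from the current behavior is that the cached OpenAI client would call `cache.get()` per request instead of holding one token string forever.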

pamelafox commented 2 months ago

Hm, is there a reason you can't let developers create the token_provider themselves? That's what I do when using Azure OpenAI typically, I create my credential with DefaultAzureCredential, call get_bearer_token_provider, and pass that into Azure OpenAI (like my code in https://blog.pamelafox.org/2023/09/best-practices-for-openai-chat-apps-go.html).

I guess you could mimic that yourself, if that's what works better for litellm.

Generally, though, I'd like to be able to use my own credential object, since I don't always have the environment variables set up the way your code currently seems to expect.

Manouchehri commented 2 months ago

Hm, is there a reason you can't let developers create the token_provider themselves?

Because people like me are using LiteLLM as an OpenAI proxy service. We can't be asking users to write a token_provider function in the yaml config.

Generally, though, I'd like to be able to use my own credential object, since I don't always have the environment variables setup the way your code currently seems to expect.

What is your environment variable setup? We/I intentionally made the OIDC auth extremely generic so that we can support using any OIDC token super easily with any service that supports OIDC auth (right now just Bedrock and Azure OpenAI).

pamelafox commented 2 months ago

In production, I sometimes have the client ID in an arbitrarily named env variable, like AZURE_MANAGED_IDENTITY_ID, and then I pass that in to ManagedIdentityCredential from azure-identity.

But maybe I would have to use AZURE_CLIENT_ID if I were to use litellm? We're not using it yet; I am evaluating it for use in various Azure samples, and managed identity support is one of the requirements.

Manouchehri commented 2 months ago

AZURE_CLIENT_ID, AZURE_TENANT_ID, and AZURE_AUTHORITY_HOST are the standards set by Microsoft, not us.

https://azure.github.io/azure-workload-identity/docs/quick-start.html#7-deploy-workload

https://learn.microsoft.com/en-us/dotnet/api/azure.identity.environmentcredential?view=azure-dotnet

Is there a reason you want to diverge from the industry standards? Adding this as a configuration option is not technically hard; it's just going to be confusing for almost everyone else (who should be using the standards).
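If litellm did want to accommodate both, the fallback logic is a one-liner; a sketch (AZURE_MANAGED_IDENTITY_ID is the example name from the comment above, not a Microsoft standard, and `resolve_client_id` is a hypothetical helper):

```python
def resolve_client_id(env):
    """Prefer the Microsoft-standard AZURE_CLIENT_ID; fall back to a
    custom variable name. Returns None if neither is set."""
    return env.get("AZURE_CLIENT_ID") or env.get("AZURE_MANAGED_IDENTITY_ID")

# Custom variable only:
print(resolve_client_id({"AZURE_MANAGED_IDENTITY_ID": "abc-123"}))  # abc-123
# Standard variable wins when both are present:
print(resolve_client_id({"AZURE_CLIENT_ID": "std-456",
                         "AZURE_MANAGED_IDENTITY_ID": "abc-123"}))  # std-456
```

In real code the result would be passed to something like azure-identity's `ManagedIdentityCredential(client_id=...)`.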

pamelafox commented 2 months ago

The target audience for my suggestion are developers currently using code like this with Azure services:

credential = DefaultAzureCredential(
    managed_identity_client_id=os.getenv("OTHER_CLIENT_ID"))
client = AzureOpenAI(
    azure_ad_token_provider=get_bearer_token_provider(
        credential, "https://cognitiveservices.azure.com/.default"))
I sometimes use code like this with user-assigned managed identities, where multiple such identities can be associated with one host and I need to specify the client ID of the particular identity the host should use when authenticating with Azure OpenAI.

I don't know how frequently developers are in the situation I describe; it may be a rare case, so you may not find it necessary to accommodate.

RigautAntoine commented 2 months ago

Is there any documentation, or are there code samples, on how to pass this azure_ad_token to the proxy? There's nothing on the site.

ishaan-jaff commented 2 months ago

cc @Manouchehri could you share how you do this ?

I can add this to our docs

krrishdholakia commented 2 months ago

hey @RigautAntoine right here - https://docs.litellm.ai/docs/providers/azure#authentication-with-azure-active-directory-tokens-microsoft-entra-id

model_list:
  - model_name: good-azure-model
    litellm_params:
      model: azure/chatgpt-v-2
      azure_ad_token: ""
      api_base: os.environ/AZURE_API_BASE

Manouchehri commented 2 months ago

And for OIDC, it's like this if you're running on Google Cloud Run:

model_list:
  - model_name: gpt-4-turbo-2024-04-09
    litellm_params:
      model: azure/gpt-4-turbo-2024-04-09
      api_version: "2024-06-01"
      azure_ad_token: "oidc/google/https://example.com"
      api_base: "https://removed.openai.azure.com"
      seed: 1337
      max_tokens: 4096
    model_info:
      base_model: azure/gpt-4-turbo-2024-04-09

RigautAntoine commented 2 months ago

Thank you @Manouchehri & @krrishdholakia. My use case relies on a POST request to https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/token to get a fresh 1-hour token. Could this token be provided to the proxy server by the client, or is there a way to set up a refresh function proxy-side?

Manouchehri commented 2 months ago

My use case relies on using a POST request to https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/token to get a fresh 1-hour token.

We already do this for you if you use the oidc/* feature. https://github.com/BerriAI/litellm/blob/e719986c81443264a03b9114ebbf14159f41b6d3/litellm/llms/azure.py#L332-L385

a way to set up a refresh function proxy-side

This is what this ticket is about fixing; right now it does not refresh. =)
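For reference, the exchange behind that oidc/* code path is a standard OAuth2 client-credentials POST. A stdlib-only sketch that just builds the request without sending it (tenant/client values are placeholders; the scope is assumed to be the Cognitive Services default):

```python
from urllib.parse import urlencode

def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Build the URL and form body for a client-credentials token request.
    The JSON response carries access_token and expires_in (typically ~3600s),
    which is why the token goes stale after about an hour."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "https://cognitiveservices.azure.com/.default",
    })
    return url, body

url, body = build_token_request("my-tenant", "my-client", "s3cret")
print(url)  # https://login.microsoftonline.com/my-tenant/oauth2/v2.0/token
```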

Manouchehri commented 2 months ago

Could I have help with this one? azure_ad_token is referenced in over 100 places, and I don't understand why there's so much code duplication around the clients.

krrishdholakia commented 2 months ago

hey @Manouchehri could you add some documentation re: oidc? specifically:

Manouchehri commented 2 months ago

hey @Manouchehri could you add some documentation

That's #3946. =)

Manouchehri commented 2 months ago

I've temporarily done a terrible patch locally that masks the problem. It doesn't fix this problem at all, but it allows me to get other things done at work that are needed.

The end of my Dockerfile.database:

...
# Create a startup script with timeout
RUN echo '#!/bin/sh\nexec timeout 3500 litellm --port 4000 "$@"' > /app/startup.sh && chmod +x /app/startup.sh

# Set the entry point to the script
ENTRYPOINT ["/app/startup.sh"]

Manouchehri commented 2 months ago

@krrishdholakia Could you help with this ticket now that #4836 is complete? :)

Manouchehri commented 1 month ago

Bump on this? #5131 has me quite concerned, as folks are (understandably) now trying to fragment apart the nicely unified OIDC auth because of this unresolved ticket. =/

Manouchehri commented 1 month ago

Bumping again. :( Still a critical issue on my end, it causes downtime.

krrishdholakia commented 1 month ago

Hey @Manouchehri missed this. I'll pick this up tomorrow

Manouchehri commented 1 month ago

Thank you so much! Let me know if you need more info.