Manouchehri opened this issue 3 months ago
this isn't a bug in litellm.
Just ran a basic test to confirm this works:
```python
import os

from litellm import completion

# assumed message payload; any standard chat message list works here
messages = [{"role": "user", "content": "Hey, how's it going?"}]

response = completion(
    model="azure/chatgpt-v-2",
    messages=messages,
    # api_key="my-fake-ad-token",
    azure_ad_token=os.getenv("AZURE_API_KEY"),
)
print(response)
```
@Manouchehri please reopen if there's a litellm-specific issue here
Pretty sure embedding != completion. Leaving this ticket closed, as I'm not going to troubleshoot it any further; I already tried the exact code you provided in #4861 to test that.
good catch - tested with embedding too, and it worked.
@krrishdholakia Can you confirm if `AZURE_API_KEY` is an API key or a Microsoft Entra token? Based on the name, I'm under the impression it's an API key.
Looking at 77ffee4e2ea347b7ba0964ba105635f7772d0dff and 65705fde2558b650de81708bad5f7e5c262036f1, I would expect both to fail 100% of the time. They appear to be proving the opposite of what should be happening.
The only ways those two commits would be passing is if:

A. `api-key` is being used instead of `Authorization` by LiteLLM (wrong)
B. ~Microsoft's docs are wrong and they accept `api-key` and `Authorization` interchangeably (possible, but I can confirm it if you'd really like)~ <- not possible, confirmed in https://github.com/BerriAI/litellm/issues/4859#issuecomment-2249024203
C. `AZURE_API_KEY` is a Microsoft Entra token and not an API key (in which case, I would recommend naming things a little differently to avoid confusion?)

If it's situation A, then you have written unit tests that only pass if LiteLLM keeps the bug.
https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#authentication
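Per those docs, the two auth schemes map to different request headers. A minimal sketch of the mapping (placeholder values, and hypothetical helper names, not LiteLLM's implementation):

```python
# Sketch of the two Azure OpenAI auth schemes described in the docs above.
# Credential values are placeholders only.

def azure_auth_header(credential: str, is_entra_token: bool) -> dict:
    """API keys belong in 'api-key'; Entra (AD) tokens in 'Authorization: Bearer'."""
    if is_entra_token:
        return {"Authorization": f"Bearer {credential}"}
    return {"api-key": credential}

print(azure_auth_header("sk-placeholder", is_entra_token=False))
# {'api-key': 'sk-placeholder'}
print(azure_auth_header("eyJ-placeholder", is_entra_token=True))
# {'Authorization': 'Bearer eyJ-placeholder'}
```

The curl runs below confirm that Azure OpenAI does not accept one scheme's credential in the other scheme's header.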
See the output below that proves situation B is not correct. https://github.com/BerriAI/litellm/issues/4859#issuecomment-2249013269
```shell
# this should ONLY work if we are NOT using a Microsoft Entra / AD token
curl "$AZURE_OPENAI_ENDPOINT/openai/deployments/text-embedding-ada-002/embeddings?api-version=2024-06-01" \
  -H "Content-Type: application/json" \
  -H "api-key: $AZURE_OPENAI_API_KEY" \
  -d '{
    "input": "Hello"
  }' -s | jq | head
```
Output:

```
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        -0.021819502,
        -0.007147768,
        -0.028617017,
```
Input:

```shell
# this should NOT work if we are NOT using a Microsoft Entra / AD token
curl "$AZURE_OPENAI_ENDPOINT/openai/deployments/text-embedding-ada-002/embeddings?api-version=2024-06-01" \
  -H "Content-Type: application/json" \
  -H "Authorization: $AZURE_OPENAI_API_KEY" \
  -d '{
    "input": "Hello"
  }'
```
Output:

```
{ "statusCode": 401, "message": "Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired." }
```
Input:

```shell
# this should NOT work if we are NOT using a Microsoft Entra / AD token
curl "$AZURE_OPENAI_ENDPOINT/openai/deployments/text-embedding-ada-002/embeddings?api-version=2024-06-01" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_API_KEY" \
  -d '{
    "input": "Hello"
  }'
```
Output:

```
{ "statusCode": 401, "message": "Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired." }
```
Either `AZURE_API_KEY` is actually a Microsoft Entra token and not an API key, or the `api-key` header is being used instead of `Authorization` by LiteLLM.

It seems obvious to me that the `api-key` header is being used for some reason instead of `Authorization`, unless the logging is wrong?
WAIT, I see the bug!! Whenever you have `AZURE_API_KEY` set, it will always override the Azure AD token flow. So you aren't testing Azure AD at all; that's why you were able to pass https://github.com/BerriAI/litellm/commit/77ffee4e2ea347b7ba0964ba105635f7772d0dff and https://github.com/BerriAI/litellm/commit/65705fde2558b650de81708bad5f7e5c262036f1.
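A minimal sketch of the precedence being described (hypothetical helper, not LiteLLM's actual code): when an API key is present it always wins, so the AD-token path is never exercised:

```python
# Hypothetical illustration of the precedence bug described above.

def build_auth_headers(api_key=None, azure_ad_token=None):
    # Buggy precedence: the api_key (picked up from AZURE_API_KEY) always
    # shadows the AD token, so the Azure AD flow is never actually tested.
    if api_key is not None:
        return {"api-key": api_key}
    if azure_ad_token is not None:
        return {"Authorization": f"Bearer {azure_ad_token}"}
    raise ValueError("no credentials provided")

# With AZURE_API_KEY set, the AD token is silently ignored:
print(build_auth_headers(api_key="key-from-env", azure_ad_token="entra-token"))
# {'api-key': 'key-from-env'}
```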
got it - that's helpful - i'll update my test and try to repro
I already updated it for you in https://github.com/BerriAI/litellm/pull/4866
Interesting - i definitely see `api-key` being passed when `api_key=` is set:

```
Headers({'host': 'openai-gpt-4-test-v-1.openai.azure.com', 'accept-encoding': 'gzip, deflate', 'connection': 'keep-alive', 'accept': 'application/json', 'content-type': 'application/json', 'user-agent': 'AzureOpenAI/Python 1.34.0', 'x-stainless-lang': 'python', 'x-stainless-package-version': '1.34.0', 'x-stainless-os': 'MacOS', 'x-stainless-arch': 'arm64', 'x-stainless-runtime': 'CPython', 'x-stainless-runtime-version': '3.11.4', 'authorization': '[secure]', 'x-stainless-async': 'false', 'api-key': 'my-fake-ad-token', 'content-length': '103'})
```
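One way to see the double-send in that dump: filter the captured headers for the credential-related keys (values below are copied from the log above; `[secure]` is the redacted authorization value):

```python
# Credential-related headers from the captured request above.
captured = {
    "authorization": "[secure]",
    "api-key": "my-fake-ad-token",
    "content-type": "application/json",
}

auth_related = {k: v for k, v in captured.items() if k in ("authorization", "api-key")}
print(auth_related)
# both headers are sent, so the server's precedence decides which one counts
```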
this seems to make requests work for azure
but when i pass via azure_ad_token, even if auth contains the right credentials - the request fails with a 401
just tested azure openai directly. this looks like something specific to how they expect the call to happen.
When i set azure_ad_token="my-api-key", i get a 401 error, but if i use a generated ad token it works even though both are being sent as bearer tokens
> but when i pass via azure_ad_token, even if auth contains the right credentials - the request fails with a 401
Are you passing an AD token or API key? If you’re passing an API key to the Azure AD token field, that’s expected to fail with a 401.
If you have AZURE_API_KEY set in your environment, LiteLLM uses that instead of the azure_ad_token. (This is what was causing my embedding unit test to fail on CircleCI.)
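A hedged sketch of the fix on the test side, using plain `os` (not the actual CircleCI config): clear `AZURE_API_KEY` before exercising the `azure_ad_token` path so the fallback cannot shadow it:

```python
import os

# Simulate the CI environment where AZURE_API_KEY is set...
os.environ["AZURE_API_KEY"] = "dummy-value"

# ...then remove it before the azure_ad_token test runs, so LiteLLM cannot
# fall back to the api-key header.
os.environ.pop("AZURE_API_KEY", None)

print("AZURE_API_KEY" in os.environ)  # False
```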
> When i set azure_ad_token="my-api-key", i get a 401 error, but if i use a generated ad token it works even though both are being sent as bearer tokens
Yes, that's exactly how it should be. This is the correct and expected behaviour.
For the unit test, 3cd3491920fa64c9e7c9635478f05615d132cafb solved it. Leaving this ticket open so we can decide later whether this is a bug or just undocumented behaviour.
https://www.ai.moda/en/blog/enforcing-azure-ad-on-openai
^ This is how I created and enforced Azure AD on an Azure OpenAI resource.
What happened?
https://github.com/BerriAI/litellm/blob/d9539e518e2d4d82ea2b6ac737de19147790e5ea/litellm/tests/test_embedding.py#L210-L216
https://learn.microsoft.com/en-us/azure/ai-services/openai/reference
This should result in a header like:
But instead, I see this in the unit test output:
https://app.circleci.com/pipelines/github/BerriAI/litellm/12911/workflows/efbcc19e-278a-4ca1-ac19-b4e43ea4a1ae/jobs/31448?invite=true#step-110-130594_108
I am very confused since this works just fine in proxy mode.
Relevant log output
Twitter / LinkedIn details
https://twitter.com/DaveManouchehri