
[Bug]: LlamaIndex cannot wrap online endpoint from AzureML #16950

Open azraimahadan opened 1 week ago

azraimahadan commented 1 week ago

Bug Description

LlamaIndex fails to wrap an AzureML endpoint. When attempting to use the AzureAICompletionsModel from LlamaIndex, an HttpResponseError with the message Operation returned an invalid status 'Failed Dependency' is raised.

from llama_index.llms.azure_inference import AzureAICompletionsModel

# Wrap the AzureML online endpoint (URL and key redacted in this report)
llm = AzureAICompletionsModel(
    endpoint='https://<>/score',
    credential='',
)
response = llm.complete("The sky is a beautiful blue and")
print(response)

Error traceback:

HttpResponseError: Operation returned an invalid status 'Failed Dependency'
Content: {"detail":"Not Found"}

Version

0.11.19

Steps to Reproduce

  1. Set up an AzureML endpoint with the URL: https://<>/score.

  2. Use the LlamaIndex library to create an AzureAICompletionsModel.

  3. Attempt to call the .complete() method with any prompt, such as "The sky is a beautiful blue and".

Relevant Logs/Tracebacks

File ~\AppData\Local\anaconda3\envs\llm\Lib\site-packages\azure\ai\inference\_patch.py:657, in ChatCompletionsClient.complete(self, body, messages, stream, frequency_penalty, presence_penalty, temperature, top_p, max_tokens, response_format, stop, tools, tool_choice, seed, model, model_extras, **kwargs)
    ...
    raise HttpResponseError(response=response)

HttpResponseError: Operation returned an invalid status 'Failed Dependency'
Content: {"detail":"Not Found"}

dosubot[bot] commented 1 week ago

Hey @azraimahadan! I'm here to help you with any bugs, questions, or contributions. Let's tackle this issue together!

The HttpResponseError with the message Operation returned an invalid status 'Failed Dependency' and content {"detail":"Not Found"} suggests that there might be an issue with the endpoint URL or the credentials used for accessing the AzureML service.

Here are a few things you can check and try:

  1. Endpoint URL: Ensure that the endpoint URL is correct and accessible. Double-check for any typos or errors in the URL.

  2. Credentials: Make sure that the credentials provided are correct and have the necessary permissions to access the AzureML endpoint. If you're using an API key, ensure it's valid. If using a credential class like DefaultAzureCredential, ensure it's properly configured (see the sketch after this list).

  3. Model Availability: Verify that the model you are trying to access is deployed and available at the specified endpoint.

  4. Network Issues: Check for any network issues that might be preventing access to the endpoint.

  5. AzureML Configuration: Ensure that the AzureML endpoint is correctly configured to handle requests from LlamaIndex. This includes checking any required headers or authentication methods.

If these steps do not resolve the issue, you may need to look into the AzureML service logs for more detailed error information.
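For point 2, a minimal sketch of passing an explicit key credential (assuming the deployment uses key-based auth; the endpoint URL and key below are placeholders):

from azure.core.credentials import AzureKeyCredential
from llama_index.llms.azure_inference import AzureAICompletionsModel

# Placeholder endpoint and key; substitute your deployment's values.
llm = AzureAICompletionsModel(
    endpoint="https://<your-endpoint>/score",
    credential=AzureKeyCredential("<your-api-key>"),
)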

To continue talking to Dosu, mention @dosu.

logan-markewich commented 1 week ago

I'm not even entirely sure it's made to handle that? Feel free to check out the source code though, maybe you can spot the issue

logan-markewich commented 1 week ago

https://github.com/run-llama/llama_index/blob/023697e778eda2cedfd01dc7dcd6f3768868a010/llama-index-integrations/llms/llama-index-llms-azure-inference/llama_index/llms/azure_inference/base.py#L136
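That line is where the integration constructs its underlying client. Roughly, it does something like this (a simplified sketch, not the exact source; endpoint and key are placeholders):

from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

endpoint = "https://<your-endpoint>"  # placeholder
api_key = "<your-api-key>"            # placeholder

# The integration talks to the service through azure-ai-inference's
# ChatCompletionsClient, which speaks the Azure AI model inference API.
# An AzureML managed online endpoint's /score route exposes a different
# contract, which would explain the 'Not Found' in the response body.
client = ChatCompletionsClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(api_key),
)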

azraimahadan commented 1 week ago

> I'm not even entirely sure it's made to handle that? Feel free to check out the source code though, maybe you can spot the issue

I've tried calling the endpoint directly and it works, but it fails when I call it through LlamaIndex's Azure integration. I suspect the integration does not support AzureML endpoints, but maybe you can explain more if I missed anything.
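For reference, a hypothetical sketch of such a direct call (assuming key auth; the payload shape depends on the deployment's scoring script):

import requests

# Hypothetical direct request to the AzureML scoring endpoint.
# Managed online endpoints expect the key as a Bearer token; the JSON
# body is whatever the deployment's scoring script accepts.
resp = requests.post(
    "https://<>/score",
    headers={
        "Authorization": "Bearer <api-key>",
        "Content-Type": "application/json",
    },
    json={"prompt": "The sky is a beautiful blue and"},
)
print(resp.status_code, resp.text)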

logan-markewich commented 1 week ago

You are probably right. But I don't have access to Azure to test these endpoints, so I pointed you towards the current source code in hopes you could make a PR :)