langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

AzureMLOnlineEndpoint not working, 424 error, but same url and api key works with standard http #16845

Closed. abelsan closed this issue 7 months ago.

abelsan commented 7 months ago


Example Code

The following code:

from langchain_community.llms.azureml_endpoint import AzureMLOnlineEndpoint
from langchain_community.llms.azureml_endpoint import AzureMLEndpointApiType
from langchain_community.llms.azureml_endpoint import DollyContentFormatter

# ----------------------------------------------------
#  Allow self-signed HTTPS (allowSelfSignedHttps call
#  omitted here; see the working script further below)
# ----------------------------------------------------

llm = AzureMLOnlineEndpoint(
    endpoint_url="https://myproject.eastus2.inference.ml.azure.com/score",
    endpoint_api_type=AzureMLEndpointApiType.realtime,
    endpoint_api_key="my-key",
    content_formatter=DollyContentFormatter(),
    model_kwargs={"temperature": 0.8, "max_tokens": 300},
)
response = llm.invoke("Write me a song about sparkling water:")
print(response)

Error Message and Stack Trace (if applicable)

Traceback (most recent call last):
  File "/Users/abel/Desktop/source/lang/dolly.py", line 24, in <module>
    response = llm.invoke("Write me a song about sparkling water:")
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 235, in invoke
    self.generate_prompt(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 530, in generate_prompt
    return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 703, in generate
    output = self._generate_helper(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 567, in _generate_helper
    raise e
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/langchain_core/language_models/llms.py", line 554, in _generate_helper
    self._generate(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/langchain_community/llms/azureml_endpoint.py", line 489, in _generate
    response_payload = self.http_client.call(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/langchain_community/llms/azureml_endpoint.py", line 50, in call
    response = urllib.request.urlopen(req, timeout=kwargs.get("timeout", 50))
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 424: Failed Dependency

Description

AzureMLOnlineEndpoint is not working and returns a 424 error, but the same URL and API key work with a plain urllib HTTP request. The working plain-HTTP script is below:

import urllib.request
import json
import os
import ssl

def allowSelfSignedHttps(allowed):
    # bypass the server certificate verification on client side
    if allowed and not os.environ.get('PYTHONHTTPSVERIFY', '') and getattr(ssl, '_create_unverified_context', None):
        ssl._create_default_https_context = ssl._create_unverified_context

allowSelfSignedHttps(True) # this line is needed if you use a self-signed certificate in your scoring service.

# Request data goes here
# The example below assumes JSON formatting which may be updated
# depending on the format your endpoint expects.
# More information can be found here:
# https://docs.microsoft.com/azure/machine-learning/how-to-deploy-advanced-entry-script
data =  {
  "input_data": [
    "Write me a super short song about sparkling water"
  ],
  "params": {
    "top_p": 0.9,
    "temperature": 0.2,
    "max_new_tokens": 50,
    "do_sample": True,
    "return_full_text": True
  }
}

body = str.encode(json.dumps(data))

url = 'https://myProject.eastus2.inference.ml.azure.com/score'
# Replace this with the primary/secondary key or AMLToken for the endpoint
api_key = 'my-key'
if not api_key:
    raise Exception("A key should be provided to invoke the endpoint")

# The azureml-model-deployment header will force the request to go to a specific deployment.
# Remove this header to have the request observe the endpoint traffic rules
headers = {
    'Content-Type': 'application/json',
    'Authorization': ('Bearer ' + api_key),
    'azureml-model-deployment': 'databricks-dolly-v2-12b-15',
}

req = urllib.request.Request(url, body, headers)

try:
    response = urllib.request.urlopen(req)

    result = response.read()
    print(result)
except urllib.error.HTTPError as error:
    print("The request failed with status code: " + str(error.code))

    # Print the headers - they include the request ID and the timestamp, which are useful for debugging the failure
    print(error.info())
    print(error.read().decode("utf8", 'ignore'))

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 23.3.0: Wed Dec 20 21:33:31 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T8112
Python Version: 3.11.4 (v3.11.4:d2340ef257, Jun 6 2023, 19:15:51) [Clang 13.0.0 (clang-1300.0.29.30)]

Package Information

langchain_core: 0.1.17
langchain: 0.1.4
langchain_community: 0.0.16

Packages not installed (Not Necessarily a Problem)

The following packages were not found:

langgraph
langserve

dosubot[bot] commented 7 months ago

🤖

Hey again @abelsan! Nice to see you diving deep into the bits and bytes. Looking forward to tackling this together. 🚀

The 424 status code means "Failed Dependency". For Azure ML online endpoints, it typically indicates that the request reached the endpoint but the underlying scoring script or model container failed, for example because the model is not available, is not properly deployed, or rejected the request payload.

However, if the same URL and API key work with standard HTTP, it's possible that the issue lies in the way the request is being made in the LangChain framework.

Looking at the AzureMLEndpointClient class, the call method is responsible for making the request to the AzureML endpoint. It constructs the headers and makes a request using urllib.request.urlopen.

# From AzureMLEndpointClient.call in langchain_community/llms/azureml_endpoint.py:
headers = {
    "Content-Type": "application/json",
    "Authorization": ("Bearer " + self.endpoint_api_key),
}
# The deployment header is only sent when a deployment_name was configured.
if self.deployment_name != "":
    headers["azureml-model-deployment"] = self.deployment_name

req = urllib.request.Request(self.endpoint_url, body, headers)
# Default timeout is 50 seconds unless a "timeout" kwarg is passed in.
response = urllib.request.urlopen(req, timeout=kwargs.get("timeout", 50))
result = response.read()
return result

One potential issue could be the timeout parameter. If the request takes longer than the specified timeout (default is 50 seconds), it will fail. You could try increasing the timeout to see if that resolves the issue.
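
If you want to rule the timeout out, one diagnostic option is to call the underlying client directly with a longer timeout and the exact body the working script sends. This is a sketch, not a documented API: http_client and its timeout kwarg are visible in the traceback above but may not be stable public surface.

import json

# Reuse the body shape from the working plain-HTTP script above.
body = json.dumps({
    "input_data": ["Write me a super short song about sparkling water"],
    "params": {"temperature": 0.2, "max_new_tokens": 50},
}).encode("utf-8")

# AzureMLEndpointClient.call defaults to a 50-second timeout
# (see kwargs.get("timeout", 50) in the traceback).
raw = llm.http_client.call(body, timeout=120)
print(raw.decode("utf-8"))

If this direct call succeeds where invoke() fails, the timeout is not the problem and the difference lies in the payload the content formatter builds.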

Another potential issue could be the azureml-model-deployment header. This header is only added if self.deployment_name is not an empty string. If the deployment name is incorrect or not properly configured in AzureML, this could cause the request to fail.
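
Note that the working plain-HTTP script in this issue pins the request to a specific deployment via the azureml-model-deployment header, while the failing AzureMLOnlineEndpoint call never sets deployment_name, so that header is omitted. If the endpoint's traffic rules don't route to a healthy deployment, that alone could produce a 424. A minimal sketch, assuming deployment_name is available as a field on your installed version of the class:

llm = AzureMLOnlineEndpoint(
    endpoint_url="https://myproject.eastus2.inference.ml.azure.com/score",
    endpoint_api_type=AzureMLEndpointApiType.realtime,
    endpoint_api_key="my-key",
    # Same deployment the working urllib script targets in its headers.
    deployment_name="databricks-dolly-v2-12b-15",
    content_formatter=DollyContentFormatter(),
    model_kwargs={"temperature": 0.8, "max_tokens": 300},
)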

Lastly, the issue could be with the request body. The body is passed as is to the urllib.request.Request method. If the body is not properly formatted or does not match the expected input for the AzureML model, this could also cause the request to fail.
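
To check this concretely, you can print the exact bytes the formatter would send and compare them with the body the working script posts. The keyword names below follow recent langchain_community releases and are an illustration, not a guaranteed interface:

formatter = DollyContentFormatter()
payload = formatter.format_request_payload(
    prompt="Write me a song about sparkling water:",
    model_kwargs={"temperature": 0.8, "max_tokens": 300},
    api_type=AzureMLEndpointApiType.realtime,
)
# Compare with the {"input_data": [...], "params": {...}} body in the working script.
print(payload)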

Without more specific information about the error message or the configuration of the AzureML model, it's difficult to provide a more precise answer. If you could provide more details about the error message or the configuration of your AzureML model, it would be helpful in diagnosing the issue.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

abelsan commented 7 months ago

@dosu-bot I tried the suggestions but did not get anywhere. Experimenting, I tried a Llama model and hit the same problem. Experimenting further, however, I came across AzureMLChatOnlineEndpoint, which worked; see the code below.

from langchain.schema import HumanMessage
from langchain_community.chat_models.azureml_endpoint import (
    AzureMLChatOnlineEndpoint,
    AzureMLEndpointApiType,
    LlamaChatContentFormatter,
)
import ssl
import os

def allowSelfSignedHttps(allowed):
    # bypass the server certificate verification on client side
    if allowed and not os.environ.get('PYTHONHTTPSVERIFY', '') and getattr(ssl, '_create_unverified_context', None):
        ssl._create_default_https_context = ssl._create_unverified_context
allowSelfSignedHttps(True) # this line is needed if you use a self-signed certificate in your scoring service.

chat = AzureMLChatOnlineEndpoint(
    endpoint_url="https://smartypants-hengu.eastus2.inference.ml.azure.com/score",
    endpoint_api_type=AzureMLEndpointApiType.realtime,
    endpoint_api_key="my-key",
    content_formatter=LlamaChatContentFormatter(),
)
response = chat.invoke(
    [HumanMessage(content="How are you?")]
)
print(response)

Is there something similar for "Dolly"? Something like "DollyChatContentFormatter"?

dosubot[bot] commented 7 months ago

🤖

Based on the information you've provided, it seems like you're encountering an HTTP Error 424 when using the AzureMLOnlineEndpoint class with the DollyContentFormatter. This error typically indicates a failed dependency, which could be due to an incorrect API key, an incorrect URL, or an incorrect request format.

Unfortunately, there is no DollyChatContentFormatter or similar class in LangChain. The available content formatters in LangChain are:

  1. OSSContentFormatter
  2. HFContentFormatter
  3. DollyContentFormatter
  4. CustomFormatter (a custom content formatter defined in the test_custom_formatter function)

If you're using the DollyContentFormatter and encountering this error, I would recommend checking your API key, URL, and request format. If the error persists, you might need to create a custom content formatter that suits your needs, similar to the CustomFormatter defined in the test_custom_formatter function. You can find more information about this in the test_azureml_endpoint.py file.

As for the AzureMLOnlineEndpoint class, it is used to interact with Azure ML online endpoints. It makes requests to a specified Azure ML endpoint and processes the responses. The class uses a content_formatter to format the request payload and the response payload. The content_formatter should be an instance of a class that inherits from ContentFormatterBase and implements the format_request_payload and format_response_payload methods. You can find more details in the azureml_endpoint.py file.
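
For illustration, a custom formatter shaped after the payload the working urllib script sends might look like the sketch below. The "input_data"/"params" keys are taken from that script, the class and method signatures follow recent langchain_community releases, and the response handling assumes the scoring script returns a JSON list of strings; all of these should be verified against your deployment.

import json
from typing import Dict

from langchain_core.outputs import Generation
from langchain_community.llms.azureml_endpoint import (
    AzureMLEndpointApiType,
    ContentFormatterBase,
)


class RawDollyContentFormatter(ContentFormatterBase):
    """Hypothetical formatter mirroring the working plain-HTTP payload."""

    def format_request_payload(
        self,
        prompt: str,
        model_kwargs: Dict,
        api_type: AzureMLEndpointApiType = AzureMLEndpointApiType.realtime,
    ) -> bytes:
        # Bare list of prompts plus a "params" object, exactly like the
        # body the working urllib script posts.
        return json.dumps(
            {"input_data": [prompt], "params": model_kwargs}
        ).encode("utf-8")

    def format_response_payload(
        self,
        output: bytes,
        api_type: AzureMLEndpointApiType = AzureMLEndpointApiType.realtime,
    ) -> Generation:
        # Assumes the scoring script returns a JSON list of generated strings;
        # adjust to whatever your deployment actually returns.
        return Generation(text=json.loads(output)[0])

Passing content_formatter=RawDollyContentFormatter() to AzureMLOnlineEndpoint would then send the same body the plain-HTTP script already proved the endpoint accepts.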

I hope this helps! If you have any more questions or need further clarification, feel free to ask.


abelsan commented 7 months ago

I suspect it's something pretty simple, but the bot does not have the info for the answer. I'll close the thread.

PrahladTnP commented 5 months ago

Hey @abelsan, I'm encountering the same issue. Did you find a solution for it?

fsimonAramis commented 4 months ago

Same issue, any solution?

weissenbacherpwc commented 1 month ago

same issue here