langchain4j / langchain4j

Java version of LangChain
https://docs.langchain4j.dev
Apache License 2.0
4.93k stars 980 forks

[BUG] Azure OpenAI integration fails to propagate HTTP errors #1875

Closed myrosia closed 1 month ago

myrosia commented 1 month ago

Describe the bug

I am using langchain4j-azure-open-ai in my project. I misconfigured part of my Azure deployment. As a result, my chat client started returning the following text to the user: Status code 403, "{"error":{"code":"403","message": "Access denied due to Virtual Network/Firewall rules."}}"

This is obviously nonsensical to show to the user, and it is difficult to catch in my code because I'd have to pattern-match the output. The implementation should throw an exception instead.

The underlying cause is the exception handling code in AzureOpenAiChatModel that swallows the exception and copies the exception message into the response passed to the user: https://github.com/langchain4j/langchain4j/blob/9d42a15ca8fb52d78158d11a429baf2af52528ba/langchain4j-azure-open-ai/src/main/java/dev/langchain4j/model/azure/AzureOpenAiChatModel.java#L319
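To make the failure mode concrete, here is a minimal standalone sketch (not the actual langchain4j code; `HttpError`, `chatSwallowing`, and `chatThrowing` are illustrative names). The bug pattern is a catch block that turns an HTTP error into ordinary response text, so the caller cannot distinguish model output from a transport failure:

```java
import java.util.concurrent.Callable;

public class SwallowDemo {
    // Simulated HTTP-layer failure, standing in for the Azure client's exception.
    static class HttpError extends RuntimeException {
        final int statusCode;
        HttpError(int statusCode, String body) {
            super("Status code " + statusCode + ", " + body);
            this.statusCode = statusCode;
        }
    }

    // Anti-pattern: the error message becomes the "chat response".
    static String chatSwallowing(Callable<String> call) {
        try {
            return call.call();
        } catch (Exception e) {
            return e.getMessage(); // user sees raw error text as the answer
        }
    }

    // Fix: propagate a typed exception the caller can actually handle.
    static String chatThrowing(Callable<String> call) {
        try {
            return call.call();
        } catch (HttpError e) {
            throw e; // status code stays available to the caller
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Callable<String> failing =
                () -> { throw new HttpError(403, "{\"error\":{\"code\":\"403\"}}"); };
        System.out.println(chatSwallowing(failing).startsWith("Status code 403")); // true
        try {
            chatThrowing(failing);
        } catch (HttpError e) {
            System.out.println(e.statusCode); // 403
        }
    }
}
```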

Log and Stack trace

This is the response from Azure HTTP client that caused the issue:

HTTP/2 403 
content-type: application/json

{"error":{"code":"403","message": "Access denied due to Virtual Network/Firewall rules."}}

To Reproduce

Provoke an HTTP error code from the Azure OpenAI endpoint, for example by configuring the firewall to reject requests from your IP. Call the @AiService with any prompt; you receive a text response starting with "Status: 403 ...".

@AiService
fun interface LlmIntentRecognitionService {
    @SystemMessage(SYSTEM_MESSAGE)
    fun chat(text: String): String
}
Expected behavior

An exception is thrown that is specific to the error code, or an IOException that the caller can catch.

This is what is done already in OpenAiChatModel here and I wonder if AzureOpenAiChatModel could do the same: https://github.com/langchain4j/langchain4j/blob/9d42a15ca8fb52d78158d11a429baf2af52528ba/langchain4j-open-ai/src/main/java/dev/langchain4j/model/openai/OpenAiChatModel.java#L279
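The general shape of that approach can be sketched as follows (hedged: `withErrorMapping`, `LlmHttpException`, and `ProviderException` are invented names for illustration, not langchain4j API). The idea is to route the provider call through a mapper that converts transport failures into one typed exception before any response object is built:

```java
import java.util.function.Supplier;

public final class ErrorMapping {
    // A single typed exception the caller can catch, carrying the HTTP status.
    public static class LlmHttpException extends RuntimeException {
        public final int status;
        public LlmHttpException(int status, String body) {
            super("HTTP " + status + ": " + body);
            this.status = status;
        }
    }

    // Stand-in for the provider SDK's exception type.
    public static class ProviderException extends RuntimeException {
        public final int status;
        public final String body;
        public ProviderException(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    // Wrap the request: map provider failures into LlmHttpException instead of
    // copying e.getMessage() into the chat response.
    public static <T> T withErrorMapping(Supplier<T> call) {
        try {
            return call.get();
        } catch (ProviderException e) {
            throw new LlmHttpException(e.status, e.body);
        }
    }
}
```

With this in place, a misconfigured deployment surfaces as a catchable `LlmHttpException(403, ...)` rather than as response text.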


langchain4j-github-bot[bot] commented 1 month ago

/cc @agoncal (azure), @jdubois (azure)

langchain4j commented 1 month ago

@myrosia thanks a lot for reporting!

@qiaoleiatms, since you are fixing this issue for streaming model, could you please add similar fix for non-streaming model? 🙏

We also need to update ChatModelListenerIT.should_listen_error and assert that exception is actually thrown by model.generate(userMessage);

langchain4j commented 1 month ago

@myrosia what behavior would you expect when content filter (e.g. violence, self harm, etc) is triggered on user input?

cc @jdubois

myrosia commented 1 month ago

Thank you for responding so quickly! My preference would be for an exception to be thrown.

Here's my reasoning. Many examples of langchain4j contain variations of

@AiService
interface MyService {
  String chat(String prompt);
}

or else some Json-mapped type is being returned.

In either case the finish reason in AiMessage is unavailable, and the developer has no meaningful way to access it unless there's an exception. In fact, the most likely outcome now is either a nonsensical String or a much less helpful exception when the output is coerced into whatever POJO has been specified.

As long as we are supporting this model and giving examples of it, I think an exception is the best option because it is a specific, handleable notification mechanism; nothing else is. In fact, in this model it is highly likely that the developer doesn't even know that different finish reasons exist.

Now, if we are returning AiMessage, the case is different because there's an explicit finish reason, and one can document that it should be checked. So in contexts where AiMessage is returned explicitly, it makes more sense to keep reasons in the reason field instead of mixing them with exceptions.
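The split being proposed here can be sketched as a toy two-level API (all names below are illustrative, not actual langchain4j types): the low-level call reports a finish reason on the response object and never throws for it, while a String-returning facade throws, because its caller has no other channel to learn the reason:

```java
public final class TwoLevels {
    enum FinishReason { STOP, LENGTH, CONTENT_FILTER }

    // Low-level response carries the finish reason explicitly.
    record Response(String text, FinishReason finishReason) {}

    // Low-level: reports CONTENT_FILTER, does not throw for it.
    // ("simulated" stands in for whatever the provider actually returned.)
    static Response generate(String prompt, FinishReason simulated) {
        return new Response(simulated == FinishReason.STOP ? "answer" : "", simulated);
    }

    // High-level: the String-returning facade converts a non-normal finish
    // reason into an exception, since the reason field is otherwise invisible.
    static String chat(String prompt, FinishReason simulated) {
        Response r = generate(prompt, simulated);
        if (r.finishReason() == FinishReason.CONTENT_FILTER) {
            throw new IllegalStateException("content filter triggered");
        }
        return r.text();
    }
}
```

This keeps the low-level contract inspectable and documented, while the high-level contract stays safe by default for developers who never look at finish reasons.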

langchain4j commented 1 month ago

@myrosia thanks a lot for detailed response!

I agree that an exception is the way to go. But I am not sure about AiMessage vs some other return type. The difference in behavior might be confusing.

We also have 2 levels of abstraction in LC4j: the low-level model API and the high-level AI Services API.

So I am wondering if both should have the same behavior (e.g. an exception) or, since the low-level API always returns a FinishReason (which can be CONTENT_FILTER), whether it should return the FinishReason and the high-level API should throw an exception. WDYT?

myrosia commented 1 month ago

Yes, that makes sense to me: high-level API with an exception, low-level API with a finish reason. That's consistent.

langchain4j commented 1 month ago

@jdubois @agoncal WDYT?

langchain4j commented 1 month ago

Related: https://github.com/langchain4j/langchain4j/issues/1586