quarkiverse / quarkus-langchain4j

Quarkus Langchain4j extension
https://docs.quarkiverse.io/quarkus-langchain4j/dev/index.html
Apache License 2.0
148 stars 89 forks source link

Duplicated assistant messages in memory when using anthropic models. #1089

Closed dennysfredericci closed 10 hours ago

dennysfredericci commented 2 days ago

I found duplicated assistant messages in memory when using quarkus-langchain4j-anthropic with websocket.

2024-11-20 12:52:16,765 INFO  [io.qua.lan.ant.QuarkusAnthropicClient$AnthropicClientLogger] (vert.x-eventloop-thread-2) Request:
- method: POST
- url: https://api.anthropic.com/v1/messages
- headers: [Accept: text/event-stream], [anthropic-version: 2023-06-01], [Content-Type: application/json], [User-Agent: Quarkus REST Client], [x-api-key: sk...AA], [content-length: 638]
- body: {
  "model" : "claude-3-5-sonnet-20241022",
  "messages" : [ {
    "role" : "user",
    "content" : [ {
      "type" : "text",
      "text" : "Hi"
    } ]
  }, {
    "role" : "assistant",
    "content" : [ {
      "type" : "text",
      "text" : "Hello! How can I help you today?"
    } ]
  }, {
    "role" : "assistant",
    "content" : [ {
      "type" : "text",
      "text" : "Hello! How can I help you today?"
    } ]
  }, {
    "role" : "user",
    "content" : [ {
      "type" : "text",
      "text" : "I found a bug"
    } ]
  } ],
  "system" : [ ],
  "max_tokens" : 500,

I did a quick troubleshooting, and discovered that the handleMessageStop method in QuarkusAnthropicClient is being called twice:

  1. First time when Anthropic sends the message_stop event
  2. Second time during onCompletion

The handleMessageStop method should only be called once, but I'm unsure about the correct timing for this single invocation.

geoand commented 1 day ago

I think we can guard against it with an AtomicBoolean, that way the timing doesn't matter. Would you be willing to provide such a patch?

Thanks!

dennysfredericci commented 20 hours ago

Yes, but don’t forget to bring a Quarkus t-shirt next time you visit Belgium! 😛

I reproduced the issue using the existing test AnthropicStreamingChatLanguageModelSmokeTest by adding an empty line for the message_stop event. A complete event also contains an empty line.

....
              event: message_delta
                data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":41}  }

                event: message_stop
                data: {"type":"message_stop"           }

                """;
....

The handleMessageStop method is now being called twice. I resolved this issue by removing its invocation from onComplete.

Do you have any objections to removing the invocation from onComplete?

I will build a version locally and test it in the application that I found the issue.

geoand commented 10 hours ago

Thanks a lot!

Yes, but don’t forget to bring a Quarkus t-shirt next time you visit Belgium!

I will try, but marketing get's all the t-shirt budget :)