langchain4j / langchain4j

Java version of LangChain
https://docs.langchain4j.dev
Apache License 2.0
3.85k stars 729 forks source link

[BUG] - Tools - Sometimes I get JsonSyntaxException because of bad formatted Tool arguments #1339

Closed cvalorereply closed 1 day ago

cvalorereply commented 1 week ago

Describe the bug

Sometimes due to a tool execution call I get a JsonSyntaxException while parsing the arguments of the ToolExecutionRequest toolExecutionRequest from the aiMessage.content(). Due to that, the tool never has an answer and consequently all the successive calls in that memory fails.

I have a chatbot which is empowered with some tools.

@Tool(name="some_task", value="when asked some_task do some task") String doSomeTask(String storyId) { ... }

@Tool(name="commit-jira", value="this is a tool used to update on jira a Story. The input are: 1- the story ID of the story to update 2- the other parameter ... 3- the body of the update call, that is what will be writtern on the story 4- a dryRun boolean parameter, default true") String commit(String storyId, String param, String response, boolena dryRun) { ... }

The first tool performs some task, the second one commit something on Jira given an id and what to commit.

Log and Stack trace stack

To Reproduce

I asked my chatbot directly like: "Can you do - some task - on story 1234". In his plan he executes the first tool and then he uses the answer as "response" value for the second tool, which is ok for me, but I see (in the debugger) that it knows exactly the value to pass on the various parameters, but it fills the arguments of the second toolExecutionRequest such as:

{ "storyId": "1234", "dryRun": true, "another_parameter": "another_value", "response": "response_ofLLM" ---> , <---_ }

instead of

{ "storyId": "1234", "dryRun": true, "another_parameter": "another_value", "response": "response_of_LLM" }

that comma causes the exception to be thrown.

Expected behavior

The parsing of parameters is done correctly or there's a way to provide a custom Tool Executor (?) or there's a retry or, at least, the successively calls does not fails (thus answering anyways with an error message to the tool execution message)

Current behaviour Exception is thrown in the DefaultAIServices while mapping arguments of tool execution, resulting in no answer and failing of successive calls in that memory since that tool message being unanswered.

Please complete the following information:

LangChain4j version: 0.31.0 LLM(s) used: Azure OpenAI (with gpt-3.5-turbo) Java version: 17 Spring Boot version (if applicable): 3.1.6

langchain4j commented 1 week ago

@cvalorereply thanks a lot for reporting! I did not encounter this problem with tools yet.

The parsing of parameters is done correctly

I guess that is the best solution for this exact case, we could configure lenient mode

there's a way to provide a custom Tool Executor (?)

That is another good idea

there's a retry

Same as above

the successively calls does not fails (thus answering anyways with an error message to the tool execution message)

Could you please elaborate?

cvalorereply commented 1 week ago

@langchain4j thanks for your time!

I guess that is the best solution for this exact case, we could configure lenient mode

I think lenient mode could be great solution, I did not encounter yet anything too "strange", just a comma at the end like it would be present another parameter but it's not (which many JSON formatter can understand and remove it automatically when formatting, such as notepad++), that unfortunately breaks GSON parsing.

.

The idea of having a Custom Tool Executor was to having the possibility such a excecutor to have custom parsing and handling our way any kind of json parsing

.

the successively calls does not fails (thus answering anyways with an error message to the tool execution message)

Could you please elaborate?

Sure, another pain point in this bug happening is that exception is thrown while elaborating tool request. image (The method which fails is argumentsAsMap) Being this outside the catch, unhandled exception is thrown. image This causes a ToolExecutionRequest message has been inserted into memory, but no ToolExecutionResponse will never be inserted. The next message sent to the chatbot, all the memory will be sent to OpenAI, but, being present a ToolExecutionRequest without a response, the answer will be (without throwing exception) 400 bad request error image response.content is Status code 400, "{ "error": { "message": "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_oorZO7WMsrqdhpIMIdfADe72", "type": "invalid_request_error", "param": "messages.[12].role", "code": null } } " My idea was then to handle somehow the exception and put a ToolExecutionResponse anyhow (even with saying just error) in such a way that, at least, the other calls to the chatbot for that memory are not broken

langchain4j commented 1 week ago

@cvalorereply oh, I see, good point!

cvalorereply commented 1 week ago

@langchain4j don't know when someone will be able to handle it, but investigating more it appears that fromJson method of GSON always work on lenient by default, but it handles ";" (semi-colon) while with "," the JsonReader underlying the Map type adapter always expects a key after a "," (comma)

One fast (but arbitrarily ugly) solution could also be:

    /**
     * Convert arguments to map.
     * @param arguments json string
     * @return map
     */
    public static Map<String, Object> argumentsAsMap(String arguments) {
        return GSON.fromJson(removeTrailingComma(arguments), MAP_TYPE);
    }

    /**
     * Removes trailing commas before closing braces or brackets in JSON strings.
     * @param json the JSON string
     * @return the corrected JSON string
     */
    private static String removeTrailingComma(String json) {
        if (json == null || json.isEmpty()) {
            return json;
        }
        Pattern pattern = Pattern.compile(",(\\s*[}\\]])");
        Matcher matcher = pattern.matcher(json);
        return matcher.replaceAll("$1");
    }

this passes these tests: (the first one already present on source code) - the last two are obv not passed by the actual version of source code on main branch -

 @Test
    public void test_argument() {
        ToolExecutionRequest request = ToolExecutionRequest.builder()
                .id("id")
                .name("name")
                .arguments("{\"foo\":\"bar\", \"qux\": 12}")
                .build();

        assertThat(ToolExecutionRequestUtil.argumentsAsMap(request.arguments()))
                .containsEntry("foo", "bar")
                        .containsEntry("qux", 12.0);

        assertThat((String) ToolExecutionRequestUtil.argument(request, "foo"))
                .isEqualTo("bar");
        assertThat((Number) ToolExecutionRequestUtil.argument(request, "qux"))
                .isEqualTo(12.0);
    }

    @Test
    public void test_argument_comma() {
        ToolExecutionRequest request = ToolExecutionRequest.builder()
                                                           .id("id")
                                                           .name("name")
                                                           .arguments("{\"foo\":\"bar\", \"qux\": 12,}")
                                                           .build();

        assertThat(ToolExecutionRequestUtil.argumentsAsMap(request.arguments()))
              .containsEntry("foo", "bar")
              .containsEntry("qux", 12.0);

        assertThat((String) ToolExecutionRequestUtil.argument(request, "foo"))
              .isEqualTo("bar");
        assertThat((Number) ToolExecutionRequestUtil.argument(request, "qux"))
              .isEqualTo(12.0);
    }

    @Test
    public void test_argument_comma_array() {
        ToolExecutionRequest request = ToolExecutionRequest.builder()
                                                           .id("id")
                                                           .name("name")
                                                           .arguments(
                                                                 "{\"key\":[{\"foo\":\"bar\", \"qux\": 12,},{\"foo\":\"bar\", \"qux\": 12,},]}")
                                                           .build();

        assertThat(ToolExecutionRequestUtil.argumentsAsMap(request.arguments()))
              .containsKey("key");

        assertTrue(ToolExecutionRequestUtil.argument(request, "key") instanceof ArrayList);

        List<Object> keys = ToolExecutionRequestUtil.argument(request, "key");

        keys.forEach(key -> {
            assertTrue(key instanceof LinkedTreeMap);
            assertTrue(((LinkedTreeMap<?, ?>) key).containsKey("foo"));
            assertTrue(((LinkedTreeMap<?, ?>) key).containsKey("qux"));
            assertThat(((LinkedTreeMap<?, ?>) key).get("foo")).isEqualTo("bar");
            assertThat(((LinkedTreeMap<?, ?>) key).get("qux")).isEqualTo(12.0);
        });
    }

Anyways, I hope someone will work on that ASAP wrt their possibility and maybe find a better solution.

Thanks for your help, wishing to hear you soon.

cvalorereply commented 1 week ago

moreover I found this

// TODO ensure this method never throws exceptions

image

on DefaultToolExecutor on actual main branch of source code, which I'm afraid it's not really ensured

langchain4j commented 4 days ago

@cvalorereply since you've already investigated the issue and proposed a solution, would you like to open a PR? I think removeTrailingComma can be a good temporary solution untill we fully migrate to Jackson.

cvalorereply commented 4 days ago

@langchain4j done 👍

PR: https://github.com/langchain4j/langchain4j/pull/1352

langchain4j commented 1 day ago

Fixed by https://github.com/langchain4j/langchain4j/pull/1352