confident-ai / deepeval

The LLM Evaluation Framework
https://docs.confident-ai.com/
Apache License 2.0

bug: Tools used not showing up in test report #987

Open chip-davis opened 2 weeks ago

chip-davis commented 2 weeks ago

Describe the bug

I am running a ToolCorrectnessMetric test. I am expecting one tool to be called. Below is my code:

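# Imports assumed from the deepeval names used below.
from deepeval import assert_test
from deepeval.metrics import ToolCorrectnessMetric
from deepeval.test_case import LLMTestCase

# get_cosmos_data, create_function_registry, and run_conversation_custom_funcs
# are project-specific helpers that are not shown in this report.


def test_tool_get_syllabus_correctness_for_all_functions():
    # Use the configuration of the first course returned.
    config = get_cosmos_data()
    params = {conf["course_name"]: conf for conf in config}
    course = next(iter(params))
    params = params[course]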

    function_registry = create_function_registry(
        pull_announcements=True,
        pull_groups=True,
        pull_syllabus=True,
        pull_discussions=True,
        pull_custom_files=True,
        pull_course_content=True,
    )

    question = "What does the syllabus say about the final?"
    messages = [{"role": "user", "content": question}]
    expected_tools = ["get_syllabus"]
    ai_response, tools_used, _ = run_conversation_custom_funcs(
        function_registry, params["prompt_template"], messages, question, params, "Chip"
    )

    metric = ToolCorrectnessMetric()
    test_case = LLMTestCase(
        input=question,
        actual_output=ai_response,
        tools_used=tools_used,
        expected_tools=expected_tools,
    )
    print(f"tools_used: {tools_used}")

    assert_test(test_case, metrics=[metric])

The test passes and my one tool call is printed, but tools_used is missing from the report that opens on confident-ai.com.

✨ You're running DeepEval's latest Tool Correctness Metric! (using None, strict=False, async_mode=True)... Done! (0.00s)
tools_used: ['get_syllabus']

To Reproduce

Steps to reproduce the behavior:

  1. Execute a test that relies on tool correctness.
  2. Pass the tools that were called as tools_used within LLMTestCase.
  3. The test passes, but the tools used do not appear in the web UI.
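The steps above come down to comparing the tools that were actually called against the expected ones. As a rough local sanity check that does not depend on the web report, that comparison can be sketched in plain Python. The helper name `missing_tools` is hypothetical, and this is only an illustration, not how ToolCorrectnessMetric is implemented:

```python
def missing_tools(tools_used, expected_tools):
    """Return the expected tools that never appear in tools_used, in order."""
    used = set(tools_used)
    return [tool for tool in expected_tools if tool not in used]


# Values taken from the test and its printed output in this report.
tools_used = ["get_syllabus"]
expected_tools = ["get_syllabus"]

missing = missing_tools(tools_used, expected_tools)
print(f"tools_used: {tools_used}, missing: {missing}")
```

An empty `missing` list here only confirms that the test case received the expected tool; it says nothing about what the report displays.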

Expected behavior

The tools used should be shown in the web UI.

Additional context

It's entirely possible I am doing something wrong on my end.

penguine-ip commented 2 weeks ago

@chip-davis hey, seems like a problem on our side, will try to fix it by tomorrow, thanks!

chip-davis commented 2 weeks ago

Sure!

penguine-ip commented 2 weeks ago

Hey @chip-davis it is fixed now! Might need the latest deepeval version tho (v1.1.4)
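For anyone checking whether their environment already has the release mentioned above, here is a minimal sketch using only the standard library. The v1.1.4 threshold comes from this thread; `parse_version` and `has_fix` are hypothetical helper names, and the version parsing is deliberately simplistic (it only looks at the first three numeric components):

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn a string such as '1.1.4' into (1, 1, 4), using the first three numbers."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def has_fix(installed, minimum="1.1.4"):
    """True if the installed version is at least the version named in this thread."""
    return parse_version(installed) >= parse_version(minimum)


try:
    installed = version("deepeval")
    print(f"deepeval {installed}: at least 1.1.4 = {has_fix(installed)}")
except PackageNotFoundError:
    print("deepeval is not installed in this environment")
```

If the check reports an older version, upgrading with `pip install -U deepeval` should pull in the newer release.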