Giskard-AI / giskard

🐢 Open-Source Evaluation & Testing for ML & LLM systems
https://docs.giskard.ai
Apache License 2.0
4.07k stars, 267 forks

Add conversation correctness evaluation #1946

Closed: pierlj closed this 1 month ago

linear[bot] commented 5 months ago

GSK-3579: Correctness evaluator does not take the full conversation into account

sentry-io[bot] commented 5 months ago

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: giskard/rag/metrics/correctness.py

Function: __call__
Unhandled Issue: LLMGenerationError: Error while evaluating the agent main in <mod...
Event Count: 6


alexcombessie commented 1 month ago

Hey @pierlj - Just to close off old PRs, is this still relevant or should we close it?

pierlj commented 1 month ago

Yes, still relevant. I thought it was already merged, to be honest. I'll take care of it.

sonarcloud[bot] commented 1 month ago

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
100.0% Coverage on New Code
0.0% Duplication on New Code
