Closed: pamelafox closed this 1 month ago
/evaluate
Starting evaluation! Check the Actions tab for progress, or wait for a comment with the results.
| metric | stat | baseline | pr129 |
|---|---|---|---|
| gpt_groundedness | mean_rating | 5.0 | 5.0 |
| ↑ | pass_rate | 1.0 | 1.0 |
| gpt_relevance | mean_rating | 5.0 | 5.0 |
| ↑ | pass_rate | 1.0 | 1.0 |
| answer_length | mean | 1017.6 | 1013.8 |
| latency | mean | 2.56 | 2.46 |
| citations_matched | mean | 0.73 | 0.73 |
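
For anyone who wants to compare the two columns programmatically, here is a minimal sketch that computes the per-metric difference (`pr129` minus `baseline`). The numbers are copied from the table above. The `results` dictionary and the `compare` helper are hypothetical names used only for illustration and are not part of the evaluation tooling.

```python
# Values copied by hand from the results table in this comment.
# The structure and names below are illustrative assumptions only.
results = {
    ("gpt_groundedness", "mean_rating"): (5.0, 5.0),
    ("gpt_groundedness", "pass_rate"): (1.0, 1.0),
    ("gpt_relevance", "mean_rating"): (5.0, 5.0),
    ("gpt_relevance", "pass_rate"): (1.0, 1.0),
    ("answer_length", "mean"): (1017.6, 1013.8),
    ("latency", "mean"): (2.56, 2.46),
    ("citations_matched", "mean"): (0.73, 0.73),
}


def compare(table):
    """Return {(metric, stat): pr - baseline}, rounded to two decimals."""
    return {key: round(pr - base, 2) for key, (base, pr) in table.items()}


deltas = compare(results)

# Print one line per metric/stat pair with a signed difference.
for (metric, stat), delta in deltas.items():
    print(f"{metric:<20} {stat:<12} {delta:+.2f}")
```

With these values, only `answer_length` and `latency` differ between the two runs. The quality ratings and `citations_matched` are unchanged.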
Purpose
We don't have permission to do that in our org.
Does this introduce a breaking change?
When developers merge from main and run the server, azd up, or azd deploy, will this produce an error? If you're not sure, try it out on an old environment.
Type of change
Code quality checklist
See CONTRIBUTING.md for more details.
N/A