cognitive-catalyst / WA-Testing-Tool

Scripts that run against Watson Assistant to perform k-fold validation on a training set, test on a blind test set, and draw precision curves for comparison.
Apache License 2.0

Compare current and previous blind test run #209

Closed andrewrfreed closed 2 years ago

andrewrfreed commented 2 years ago

There are several useful potential comparisons between two blind test runs, as indicated in the blind-out.csv files:

It would be useful to generate these metrics optionally when a "previous run" is supplied. Ideally, this could be invoked as a standalone process that receives three inputs:
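
As a rough illustration of the kind of comparison described in this issue (not the tool's actual implementation, and not a reconstruction of the inputs listed above), here is a minimal sketch in Python. It assumes each blind-out.csv has one row per test utterance, with a column holding the utterance text and a column recording whether the prediction was correct. The column names `utterance` and `correct`, the helper names `load_results` and `compare_runs`, and the sample rows are all hypothetical.

```python
import csv
import io


def load_results(csv_file, key_col="utterance", correct_col="correct"):
    """Map each utterance to True/False from an open CSV file object.

    The column names are assumptions for illustration only.
    """
    reader = csv.DictReader(csv_file)
    return {
        row[key_col]: row[correct_col].strip().lower() == "true"
        for row in reader
    }


def accuracy(run):
    """Fraction of utterances predicted correctly (0.0 for an empty run)."""
    return sum(run.values()) / len(run) if run else 0.0


def compare_runs(previous, current):
    """Summarize what changed between two runs over their shared utterances."""
    shared = previous.keys() & current.keys()
    return {
        # Wrong in the previous run, right in the current one.
        "fixed": sorted(u for u in shared if not previous[u] and current[u]),
        # Right in the previous run, wrong in the current one.
        "regressed": sorted(u for u in shared if previous[u] and not current[u]),
        "previous_accuracy": accuracy(previous),
        "current_accuracy": accuracy(current),
    }


# Made-up sample data standing in for two blind-out.csv files.
previous_csv = io.StringIO(
    "utterance,correct\nhello,true\nreset password,false\nbye,true\n"
)
current_csv = io.StringIO(
    "utterance,correct\nhello,true\nreset password,true\nbye,false\n"
)

summary = compare_runs(load_results(previous_csv), load_results(current_csv))
print(summary["fixed"], summary["regressed"])
```

With real files, the `io.StringIO` objects would be replaced by `open(path, newline="")` handles, and the column names adjusted to whatever the blind-out.csv files actually contain.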