Possible bug in changepoint scoring

amith-ananthram commented 1 year ago

When I run the changepoint scorer against the example files, the scores_by_class.tab written to the output directory is empty. After digging into the code a bit, I think there's a bug: the code calls .append on a pandas DataFrame, which, despite its name, is not an in-place operation. It returns a new DataFrame, and that return value is discarded here, so the scores never accumulate. I suspect the code at that point expects to be operating on a plain list. With the change below, things seem to work. Just wanted to draw your attention to it.

To reproduce:

    CCU_scoring score-cd \
        -s test/pass_submissions/pass_submissions_LDC_reference_sample/CD/CCU_P1_TA1_CD_NIST_mini-eval1_20220531_050236 \
        -ref test/reference/LDC_reference_sample \
        -i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.CD.scoring.index.tab

The bug (in https://github.com/usnistgov/ccu_validation_scoring/blob/master/CCU_validation_scoring/score_changepoint.py#L263):

    for idx, act in enumerate(alist):
        for iout in delta_cp_thresholds[act]:
            scores[iout].append([act, apScores[idx][iout][0], apScores[idx][iout][1], apScores[idx][iout][2], apScores[idx][iout][3]])
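
To illustrate why a discarded return value leaves the container empty, here is a minimal sketch in plain Python. `AppendReturnsCopy` is a hypothetical stand-in (not a pandas class) that only mimics the "append returns a new object" behaviour; the row values are made up.

```python
class AppendReturnsCopy:
    """Hypothetical stand-in for an object whose append() is not in-place."""

    def __init__(self, rows=()):
        self.rows = tuple(rows)

    def append(self, row):
        # Return a new object; self is left unchanged.
        return AppendReturnsCopy(self.rows + (row,))


table = AppendReturnsCopy()
table.append(["greeting", 0.5])          # result discarded: table stays empty
discarded_len = len(table.rows)

table = table.append(["greeting", 0.5])  # rebinding keeps the new object
rebound_len = len(table.rows)

rows = []
rows.append(["greeting", 0.5])           # list.append mutates in place
list_len = len(rows)

print(discarded_len, rebound_len, list_len)  # prints: 0 1 1
```

As a side note, DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on a recent pandas the same call would raise an AttributeError rather than silently doing nothing.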

One possible fix (here: https://github.com/usnistgov/ccu_validation_scoring/blob/master/CCU_validation_scoring/score_changepoint.py#L220):

diff --git a/CCU_validation_scoring/score_changepoint.py b/CCU_validation_scoring/score_changepoint.py
index d8227ca..adab9fb 100644
--- a/CCU_validation_scoring/score_changepoint.py
+++ b/CCU_validation_scoring/score_changepoint.py
@@ -216,8 +216,9 @@ def compute_multiclass_cp_pr(ref, hyp, delta_cp_text_thresholds = 100, delta_cp_
     """
     # Initialize

-    scores = {}
-    [ scores.setdefault(iout, pd.DataFrame([], columns = ['type', 'ap', 'precision', 'recall', 'llr'])) for iout in delta_cp_text_thresholds + delta_cp_time_thresholds ]
+    from collections import defaultdict
+    scores = defaultdict(list)
+    # [ scores.setdefault(iout, pd.DataFrame([], columns = ['type', 'ap', 'precision', 'recall', 'llr'])) for iout in delta_cp_text_thresholds + delta_cp_time_thresholds ]

     ### Capture the noscores for later use
     ref_noscore = ref.loc[ref.impact_scalar == 'NO_SCORE_REGION']
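
For reference, the accumulation pattern this diff relies on can be sketched on its own as below. This is not the repository's code: `collect_scores`, its parameter names, and the sample inputs are assumptions made for illustration, and it assumes each score entry holds exactly four values (ap, precision, recall, llr).

```python
from collections import defaultdict

# Column order taken from the DataFrame definition removed in the diff.
COLUMNS = ["type", "ap", "precision", "recall", "llr"]


def collect_scores(class_names, thresholds_by_class, ap_scores):
    """Group one score row per (class, threshold) under its threshold key."""
    scores = defaultdict(list)
    for idx, name in enumerate(class_names):
        for threshold in thresholds_by_class[name]:
            ap, precision, recall, llr = ap_scores[idx][threshold]
            # list.append mutates in place, so every row is kept.
            scores[threshold].append([name, ap, precision, recall, llr])
    return scores


# Made-up inputs for illustration only.
class_names = ["A", "B"]
thresholds_by_class = {"A": [10, 100], "B": [10]}
ap_scores = [
    {10: (0.5, 0.6, 0.7, 0.1), 100: (0.8, 0.9, 1.0, 0.2)},
    {10: (0.3, 0.4, 0.5, 0.0)},
]
scores = collect_scores(class_names, thresholds_by_class, ap_scores)
```

If a DataFrame is needed afterwards, each list of rows can be passed to pd.DataFrame(rows, columns=COLUMNS). One difference from the original setdefault loop: a defaultdict only creates a key on first access, so a threshold that never receives a row has no entry until something looks it up.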

Thanks!

amith-ananthram commented 1 year ago

Here's the diff for the defaultdict solution above:

nist_error_correction.txt

jfiscus commented 1 year ago

Thanks for the bug report. Commit https://github.com/usnistgov/ccu_validation_scoring/commit/0f9757db83e203520a6ccfc25655dfb84e2aa9fd fixes the code.

amith-ananthram commented 1 year ago

Thank you!
