Striveworks / valor

Valor is a centralized evaluation store which makes it easy to measure, explore, and rank model performance.
https://striveworks.github.io/valor/

BUG: Obj Det confusion matrix OOM #792

czaloom closed this issue 1 month ago

czaloom commented 1 month ago

Reproducible Example

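# DataLoader and MetricType are valor classes and generate_random_detections
# is a test helper; their imports are omitted from this report.
def test_rsb_confusion_matrix():
    dets = generate_random_detections(1000, 30, 'abcde')
    loader = DataLoader()
    loader.add_bounding_boxes(dets)
    evaluator = loader.finalize()
    # Requesting only the confusion matrix is enough to trigger the failure.
    metrics = evaluator.evaluate(
        iou_thresholds=[0.25, 0.75],
        score_thresholds=[0.5],
        metrics_to_return=[MetricType.ConfusionMatrix],
    )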

Issue Description

@rsbowman-striveworks has pointed out that the confusion matrix computation is likely to run out of memory (OOM) on medium-sized datasets.

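The cause is a broadcast comparison whose intermediate array has shape (N, M, 2), where N is the number of rows in groundtruths and M is the number of rows in groundtruths_passing_ious, so its size grows with the product of the two: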

    mask_groundtruths_with_passing_ious = (
        (
>           groundtruths.reshape(-1, 1, 2)
            == groundtruths_passing_ious.reshape(1, -1, 2)
        )
        .all(axis=2)
        .any(axis=1)
    )

NumPy materializes the full broadcast result before the all and any reductions run, so at one byte per boolean element the allocation fails:

E           numpy.core._exceptions._ArrayMemoryError: Unable to allocate 35.3 GiB for an array with shape (948452, 20005, 2) and data type bool
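The reported size can be checked from the shape in the error message. The following is a small sketch of that arithmetic; the function name broadcast_compare_bytes is made up for illustration and is not part of valor.

```python
import numpy as np


def broadcast_compare_bytes(n_rows, n_candidates, width=2):
    """Bytes needed for the (n_rows, n_candidates, width) boolean intermediate.

    Hypothetical helper for illustration only; not part of valor.
    """
    return n_rows * n_candidates * width * np.dtype(bool).itemsize


# Shape taken from the error message above: (948452, 20005, 2).
nbytes = broadcast_compare_bytes(948452, 20005)
print(f"{nbytes / 2**30:.1f} GiB")  # prints "35.3 GiB"
```

Because the footprint is a product of two dataset-dependent counts, doubling both roughly quadruples the allocation.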

Expected Behavior

The confusion matrix computation should be able to complete for large datasets, even if that means a tradeoff in computation time.
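One way to make that tradeoff, sketched here under assumptions, is to run the same comparison in chunks so that only a (chunk_size, M, 2) intermediate exists at any time. This is not necessarily the fix adopted in valor. The function name rows_in_chunked and the chunk_size parameter are hypothetical, and the inputs are assumed to be arrays of two-column rows as in the snippet above.

```python
import numpy as np


def rows_in_chunked(rows, candidates, chunk_size=1024):
    """Mark which rows of `rows` also appear as a row of `candidates`.

    Gives the same mask as
        (rows.reshape(-1, 1, 2) == candidates.reshape(1, -1, 2))
            .all(axis=2).any(axis=1)
    but compares at most `chunk_size` rows at a time.
    Hypothetical helper for illustration; not part of valor.
    """
    rows = np.asarray(rows).reshape(-1, 2)
    candidates = np.asarray(candidates).reshape(1, -1, 2)
    mask = np.zeros(rows.shape[0], dtype=bool)
    for start in range(0, rows.shape[0], chunk_size):
        stop = start + chunk_size
        # Intermediate here has shape (<= chunk_size, M, 2), not (N, M, 2).
        block = rows[start:stop].reshape(-1, 1, 2)
        mask[start:stop] = (block == candidates).all(axis=2).any(axis=1)
    return mask


# Small usage example with a deliberately tiny chunk size.
groundtruths = np.array([[0, 1], [2, 3], [4, 5], [0, 1], [6, 7]])
passing = np.array([[4, 5], [0, 1]])
mask = rows_in_chunked(groundtruths, passing, chunk_size=2)
```

Peak memory now scales with chunk_size times M rather than N times M, at the cost of a Python-level loop over the chunks.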