valor version checks

Reproducible Example

```python
def test_rsb_confusion_matrix():
    dets = generate_random_detections(1000, 30, 'abcde')
    loader = DataLoader()
    loader.add_bounding_boxes(dets)
    evaluator = loader.finalize()
    metrics = evaluator.evaluate(
        iou_thresholds=[0.25, 0.75],
        score_thresholds=[0.5],
        metrics_to_return=[MetricType.ConfusionMatrix],
    )
```
Issue Description

@rsbowman-striveworks has pointed out that confusion matrices are likely to OOM on medium-sized datasets.

This is due to the construction of extremely large, dataset-dependent intermediates, as in the following:
```python
mask_groundtruths_with_passing_ious = (
    (
        groundtruths.reshape(-1, 1, 2)
        == groundtruths_passing_ious.reshape(1, -1, 2)
    )
    .all(axis=2)
    .any(axis=1)
)
```
The broadcasted comparison materializes a boolean intermediate of shape `(N, M, 2)`, where `N` is the number of groundtruth pairs and `M` the number of passing pairs, so memory grows with the product `N * M` rather than with either array alone. On medium-sized datasets the allocation fails outright:

```
E   numpy.core._exceptions._ArrayMemoryError: Unable to allocate 35.3 GiB for an array with shape (948452, 20005, 2) and data type bool
```
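For scale, the shape in the traceback accounts for the allocation exactly; this quick check uses only the numbers from the error message:

```python
import numpy as np

# Shape reported by the ArrayMemoryError above; a numpy bool occupies 1 byte.
shape = (948_452, 20_005, 2)
print(np.prod(shape) / 2**30)  # ~35.3 (GiB), matching the error message
```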
Expected Behavior

The confusion matrix computation should be able to complete for large datasets, even if that means a tradeoff in computation time.
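One direction that would honor that tradeoff is to compute the same mask over fixed-size chunks of the groundtruth array, so the boolean intermediate stays bounded regardless of dataset size. A minimal sketch, not a proposed patch: `membership_mask` and `chunk_size` are hypothetical names, and the `(N, 2)` pair layout is assumed from the snippet above.

```python
import numpy as np

def membership_mask(
    groundtruths: np.ndarray,  # shape (N, 2), assumed pair layout
    passing: np.ndarray,       # shape (M, 2), assumed pair layout
    chunk_size: int = 10_000,
) -> np.ndarray:
    """Row-wise membership test equivalent to the broadcasted one-liner.

    Processes `groundtruths` in chunks so the boolean intermediate holds
    at most chunk_size * M * 2 entries instead of N * M * 2, trading a
    Python-level loop for bounded memory.
    """
    mask = np.empty(len(groundtruths), dtype=bool)
    expanded = passing.reshape(1, -1, 2)
    for start in range(0, len(groundtruths), chunk_size):
        chunk = groundtruths[start : start + chunk_size]
        mask[start : start + chunk_size] = (
            (chunk.reshape(-1, 1, 2) == expanded)
            .all(axis=2)
            .any(axis=1)
        )
    return mask
```

With the failing shape above, a 10,000-row chunk caps each intermediate at roughly 10,000 × 20,005 × 2 bytes ≈ 0.37 GiB per step instead of 35.3 GiB in one allocation.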