tobac-project / tobac

Tracking and object-based analysis of clouds
BSD 3-Clause "New" or "Revised" License

Offline bulk statistic method does not raise error when labels of feature dataframe and segmentation mask don't match #446

Closed by JuliaKukulies 3 weeks ago

JuliaKukulies commented 1 month ago

The offline bulk statistic method (tobac.utils.get_statistics_from_mask()) should raise an error when the labels in the feature dataframe and the segmentation mask do not match. In that case no statistics can be calculated, and the method silently returns an empty column instead.

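Our current check for whether any of the feature values exist in the segmentation mask does not work: the size of the intersection can never exceed the number of unique labels in the mask, so the condition is never true and the error is never raised. It should be replaced with something like np.isin(features_t.feature, np.unique(segmentation_mask_t)).sum().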

Current code:

    # make sure that the labels in the segmentation mask exist in feature dataframe
    if (
        np.intersect1d(np.unique(segmentation_mask_t), features_t.feature).size
        > np.unique(segmentation_mask_t).size
    ):
        raise ValueError(
            "The labels of the segmentation mask and the feature dataframe do not seem to match. Please make sure you provide the correct input feature dataframe to calculate the bulk statistics. "
        )
    else:
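
For comparison, here is a minimal sketch of the np.isin-based check suggested above. It is not the fix that was merged. The helper name count_matching_labels is hypothetical, the mask is assumed to be array-like, and treating 0 as the background label is an assumption.

```python
import numpy as np
import pandas as pd


def count_matching_labels(features_t: pd.DataFrame, segmentation_mask_t) -> int:
    """Count the features whose label occurs in the segmentation mask.

    Hypothetical helper for illustration only. Raises ValueError when
    no feature label is found in the mask.
    """
    mask_labels = np.unique(np.asarray(segmentation_mask_t))
    # Assumption: 0 marks unsegmented background, so it is not a feature label.
    mask_labels = mask_labels[mask_labels != 0]

    # np.isin returns one boolean per feature; summing counts the matches.
    n_matching = int(np.isin(features_t["feature"].to_numpy(), mask_labels).sum())

    if n_matching == 0:
        raise ValueError(
            "The labels of the segmentation mask and the feature dataframe "
            "do not seem to match."
        )
    return n_matching


# Example: mask labels are {1, 2}; features 1 and 2 match, feature 3 does not.
mask = np.array([[0, 1, 1], [2, 2, 0]])
features = pd.DataFrame({"feature": [1, 2, 3]})
n_found = count_matching_labels(features, mask)  # 2
```

With a feature dataframe whose labels are all absent from the mask (for example 10 and 11), this sketch raises the ValueError, whereas the intersect1d comparison above evaluates to 0 > 3 and stays silent.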

JuliaKukulies commented 1 month ago

I will fix that along with #440

w-k-jones commented 3 weeks ago

Resolved by #448