Offline bulk statistic method does not raise error when labels of feature dataframe and segmentation mask don't match

JuliaKukulies commented 1 month ago

The offline bulk statistic method (tobac.utils.get_statistics_from_mask()) should raise an error when the labels in the feature dataframe and the segmentation mask do not match. In that case no statistics can be calculated, and the result is just an empty column.

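Our current check for whether any of the feature values exist in the segmentation mask does not work: the intersection of the two label sets can never contain more elements than the unique mask labels, so the condition is never true and the error is never raised. It should be replaced with something like np.isin(features_t.feature, np.unique(segmentation_mask_t)).sum().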
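
A minimal sketch of what such a check could look like. The helper name `count_matching_labels` is hypothetical and not part of tobac, and raising only when no label matches at all is an assumption about the intended behaviour:

```python
import numpy as np


def count_matching_labels(feature_labels, segmentation_mask_t):
    """Count feature labels that occur in the mask; raise if none do.

    Hypothetical helper, not part of tobac. `feature_labels` stands in for
    `features_t.feature` from the issue text.
    """
    mask_labels = np.unique(segmentation_mask_t)
    # np.isin gives one boolean per feature label: True if it is in the mask
    n_matching = int(np.isin(feature_labels, mask_labels).sum())
    if n_matching == 0:
        raise ValueError(
            "The labels of the segmentation mask and the feature dataframe "
            "do not seem to match."
        )
    return n_matching
```

With feature labels `[1, 2, 3]`, a mask containing the labels 0, 1 and 2 passes the check, while a mask containing only 0, 7 and 8 raises the error.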

Current code:

    # make sure that the labels in the segmentation mask exist in feature dataframe
    if (
        np.intersect1d(np.unique(segmentation_mask_t), features_t.feature).size
        > np.unique(segmentation_mask_t).size
    ):
        raise ValueError(
            "The labels of the segmentation mask and the feature dataframe do not seem to match. Please make sure you provide the correct input feature dataframe to calculate the bulk statistics. "
        )
    else:
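
The condition above can be tried on made-up labels that clearly do not match. This is only an illustration with invented values, not output from tobac:

```python
import numpy as np

# Invented example: the mask holds labels 7 and 8 (0 is background),
# while the feature labels are 1, 2 and 3.
mask_labels = np.unique(np.array([[0, 7], [8, 8]]))
feature_labels = np.array([1, 2, 3])

n_common = np.intersect1d(mask_labels, feature_labels).size

# The intersection is a subset of the mask labels, so its size can never
# exceed theirs: this comparison is False even for a complete mismatch.
current_check_raises = n_common > mask_labels.size
```

Here `n_common` is 0 and `current_check_raises` is `False`, so the ValueError is not reached.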

[x] Have you searched the issue tracker for the same problem?
[x] Have you checked if you're using the latest version? If not, which version are you using?
[x] Have you mentioned the steps to reproduce the issue?

JuliaKukulies commented 1 month ago

I will fix that along with #440.

w-k-jones commented 3 weeks ago

Resolved by #448

tobac-project / tobac

Offline bulk statistic method does not raise error when labels of feature dataframe and segmentation mask don't match #446