zqhang / AnomalyCLIP

Official implementation for AnomalyCLIP (ICLR 2024)

Thresholding at inference time #17

Open cescomuch opened 1 month ago

cescomuch commented 1 month ago

Hi, it is not very clear to me how to derive a threshold at the segmentation level to decide whether a pixel is classified as "normal" or "anomaly". Does the output anomaly map need to be normalized with something like this?

normalized_anomaly_map = (anomaly_map - anomaly_map.min()) / (anomaly_map.max() - anomaly_map.min())
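
For reference, a self-contained sketch of that per-image min-max normalization (the helper name and the epsilon guard are my own additions, not from the repo):

```python
import torch

def minmax_normalize(anomaly_map: torch.Tensor) -> torch.Tensor:
    # Rescale a per-image anomaly map to [0, 1]; the small epsilon
    # avoids division by zero when the map is constant.
    amin, amax = anomaly_map.min(), anomaly_map.max()
    return (anomaly_map - amin) / (amax - amin + 1e-8)
```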

Thanks

cescomuch commented 1 month ago

I'd like to ask for confirmation. My doubt comes from the fact that the values within the anomaly map are not in the range [0, 1], which may be because the maps obtained at the various feature levels are summed with

anomaly_map = anomaly_map.sum(dim = 0)

zqhang commented 1 month ago

Maybe you could set the threshold to the maximum anomaly score over the original training dataset samples (which are all normal).
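
A sketch of that idea, assuming anomaly maps have already been computed for the all-normal training images; `compute_anomaly_map` and `normal_train_images` are illustrative placeholders, not AnomalyCLIP API:

```python
import torch

# Collect anomaly maps for the normal training images
# (compute_anomaly_map stands in for a full model forward pass).
train_maps = [compute_anomaly_map(img) for img in normal_train_images]

# Threshold = highest score ever assigned to a normal pixel.
threshold = torch.stack(train_maps).max().item()

# At test time, flag every pixel whose score exceeds that value.
pred_mask = compute_anomaly_map(test_image) > threshold
```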

zqhang commented 1 month ago

As for the range of the anomaly map, you can just average it, i.e., anomaly_map = anomaly_map.sum(dim=0) / 4 (4 being the number of intermediate visual features).
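
Equivalently, a small sketch that divides by the number of feature levels instead of a hardcoded 4, assuming the maps are stacked along dim 0:

```python
# anomaly_map: (num_levels, H, W), one map per intermediate visual feature
num_levels = anomaly_map.shape[0]  # 4 in the setup described above
anomaly_map = anomaly_map.sum(dim=0) / num_levels  # same as anomaly_map.mean(dim=0)
```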

cescomuch commented 3 weeks ago

> As for the range of the anomaly map, you can just average it, i.e., anomaly_map = anomaly_map.sum(dim=0) / 4 (4 being the number of intermediate visual features).

This sounds reasonable to me.

cescomuch commented 3 weeks ago

> Maybe you could set the threshold to the maximum anomaly score over the original training dataset samples (which are all normal).

This, on the other hand, does not seem correct to me. If even one prediction on a 'normal' training sample is wrong (for example, a spuriously high score on a normal pixel), the maximum-based threshold will be wrong as well.

An approach that seems more reasonable is to test multiple thresholds (e.g., 0.4, 0.45, 0.5, 0.55, 0.6), since after normalization we are dealing with scores in the range [0, 1], and to choose the one that maximizes some metric (e.g., accuracy) on a validation set.
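
A minimal sketch of that sweep, assuming a small labeled validation set is available; the helper name is hypothetical, and I've swapped accuracy for F1 since anomalous pixels are usually a small minority, but the structure is identical for any metric:

```python
import numpy as np
from sklearn.metrics import f1_score

def select_threshold(scores: np.ndarray, labels: np.ndarray) -> float:
    # scores: per-pixel anomaly scores normalized to [0, 1]
    # labels: per-pixel ground truth (1 = anomaly, 0 = normal)
    candidates = [0.40, 0.45, 0.50, 0.55, 0.60]
    best_t, best_metric = candidates[0], -1.0
    for t in candidates:
        # F1 instead of accuracy: with rare anomalous pixels, accuracy
        # is dominated by the normal class and can be misleading.
        metric = f1_score(labels.ravel(), (scores.ravel() > t).astype(int))
        if metric > best_metric:
            best_t, best_metric = t, metric
    return best_t
```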