zqhang / AnomalyCLIP

Official implementation for AnomalyCLIP (ICLR 2024)
MIT License

Thresholding at inference time #17

Open cescomuch opened 5 months ago

cescomuch commented 5 months ago

Hi, it is not clear to me how to derive a threshold at the segmentation level to decide whether a pixel is classified as "normal" or "anomalous". Does the output anomaly map need to be normalized with something like this?

```python
normalized_anomaly_map = (anomaly_map - anomaly_map.min()) / (anomaly_map.max() - anomaly_map.min())
```

Thanks

cescomuch commented 5 months ago

Could I get confirmation on this? My doubt comes from the fact that the values within the anomaly map are not in the range [0, 1]. It may be because the maps obtained at the various levels are summed with

```python
anomaly_map = anomaly_map.sum(dim=0)
```
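As a quick illustration of the range issue (with dummy data, not the repository's actual pipeline): each of the four per-level maps holds values in [0, 1], so their pixel-wise sum can exceed 1.

```python
import torch

# Dummy stand-in for the four per-level anomaly maps, shape (4, H, W).
level_maps = torch.rand(4, 224, 224)

# Summing over the level dimension yields per-pixel values in [0, 4],
# so the result is no longer a probability.
summed = level_maps.sum(dim=0)
print(summed.max().item())  # comfortably above 1.0
```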

zqhang commented 5 months ago

Maybe you could set the threshold to the maximum value over the original training dataset samples (all normal).
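A minimal sketch of this suggestion, assuming two hypothetical helpers: `compute_anomaly_map` (the repository's inference step for one image) and `normal_loader` (an iterator over the all-normal training images).

```python
import torch

# Hypothetical helpers: `compute_anomaly_map` runs inference on one image,
# `normal_loader` iterates over the (all-normal) training images.
with torch.no_grad():
    threshold = max(
        compute_anomaly_map(image).max().item()  # highest pixel score per image
        for image in normal_loader
    )

# At test time, a pixel would then be flagged as anomalous
# when its score exceeds `threshold`.
```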

zqhang commented 5 months ago

As for the range of the anomaly map, you can simply average it, i.e., anomaly_map = anomaly_map.sum(dim=0) / 4 (4 being the number of intermediate visual features).
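A minimal sketch of the averaging, again with dummy data in place of the real per-level maps:

```python
import torch

# Stack of the four per-level maps, shape (4, H, W); dummy data here.
level_maps = torch.rand(4, 224, 224)

# Dividing the sum by the number of levels keeps every pixel score in [0, 1].
anomaly_map = level_maps.sum(dim=0) / level_maps.shape[0]
# equivalent: anomaly_map = level_maps.mean(dim=0)
assert 0.0 <= anomaly_map.min().item() <= anomaly_map.max().item() <= 1.0
```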

cescomuch commented 4 months ago

> As for the range of the anomaly map, you can simply average it, i.e., anomaly_map = anomaly_map.sum(dim=0) / 4 (4 being the number of intermediate visual features).

This sounds reasonable to me.

cescomuch commented 4 months ago

> Maybe you could set the threshold to the maximum value over the original training dataset samples (all normal).

This, on the other hand, does not seem correct. If even one normal training sample is scored incorrectly (i.e., receives a spuriously high anomaly score), the max-based threshold will be skewed.

An approach that seems more reasonable is to test multiple thresholds (e.g., 0.4, 0.45, 0.5, 0.55, 0.6), since we are dealing with values in the range [0, 1], and choose the one that maximizes some metric (e.g., accuracy) on a labeled validation set.
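A minimal sketch of such a sweep; `val_maps` and `val_masks` are hypothetical arrays holding the normalized anomaly maps and the binary ground-truth masks of a labeled validation set, with pixel accuracy standing in for the chosen metric.

```python
import numpy as np

def best_threshold(val_maps, val_masks,
                   candidates=(0.4, 0.45, 0.5, 0.55, 0.6)):
    """Return the candidate threshold with the best pixel accuracy."""
    gt = val_masks.astype(bool)
    best_t, best_score = candidates[0], -1.0
    for t in candidates:
        pred = val_maps >= t                # binarize the anomaly maps
        score = (pred == gt).mean()         # pixel accuracy; F1/IoU also work
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```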