openvinotoolkit / anomalib

An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.
https://anomalib.readthedocs.io/en/latest/
Apache License 2.0

Why does the image_AUROC of ganomaly change so much every time? #1093

Open FFFox-abc opened 1 year ago

FFFox-abc commented 1 year ago

When I use the command `python tools/train.py --model ganomaly --config src/anomalib/models/ganomaly/config.yaml` to train, the image_AUROC changes a lot from run to run: sometimes it is 0.51..., sometimes 0.23... I noticed that the random seed is set to 42, so shouldn't the result be the same every time?
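A minimal sketch of tightening determinism at the plain PyTorch Lightning level (independent of anomalib's own config handling; GAN training is especially sensitive to nondeterministic cuDNN kernels, so a fixed seed alone may not make runs identical):

    # Minimal sketch, not anomalib-specific: seed everything and disable the
    # usual sources of run-to-run drift at the PyTorch/cuDNN level.
    import torch
    from pytorch_lightning import seed_everything

    seed_everything(42, workers=True)          # seeds python, numpy and torch (incl. dataloader workers)
    torch.backends.cudnn.deterministic = True  # prefer deterministic cuDNN kernels
    torch.backends.cudnn.benchmark = False     # disable cuDNN autotuning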

And can you please explain why the anomaly score is over 100% for both good images and broken images?

JoaoMBarbosa99 commented 11 months ago

I have the same problem with scores over 100 and wrong labels. Have you found the cause?

D3lt4lph4 commented 10 months ago

Hi,

I was running some tests on the MVTec dataset with ganomaly and got similar anomaly scores of more than 100% (sometimes as high as ~1500).

Searching through the code, I found that the Ganomaly model computes a form of normalization based on all the predictions (here). Basically, in the model's test_step function, the min and max anomaly scores of the current batch are compared against the min and max stored from previous batches, and the stored values are updated. Then, in the test_epoch_end function, normalization is performed with respect to these min and max values.
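Schematically, the pattern is roughly the following (hypothetical names, not the exact anomalib source):

    # Sketch of the described pattern: track a running min/max per batch, then
    # min-max normalize at epoch end (hypothetical class, not anomalib code).
    import torch

    class MinMaxTracker:
        def __init__(self) -> None:
            self.min_scores = torch.tensor(float("inf"))
            self.max_scores = torch.tensor(float("-inf"))

        def update(self, pred_scores: torch.Tensor) -> None:
            # what test_step does: only update the running min/max
            self.min_scores = torch.minimum(self.min_scores, pred_scores.min())
            self.max_scores = torch.maximum(self.max_scores, pred_scores.max())

        def normalize(self, pred_scores: torch.Tensor) -> torch.Tensor:
            # what test_epoch_end does: map raw scores into [0, 1]
            return (pred_scores - self.min_scores) / (self.max_scores - self.min_scores)

    tracker = MinMaxTracker()
    for scores in (torch.tensor([120.0, 480.0]), torch.tensor([60.0, 1500.0])):
        tracker.update(scores)
    print(tracker.normalize(torch.tensor([120.0, 480.0])))  # tensor([0.0417, 0.2917])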

Now, the problem is that the callback handling visualization is called BEFORE the normalization happens (on_test_batch_end, here), so the displayed values are un-normalized (and multiplied by 100).
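A plain-Python sketch of that call order (stand-in functions, not the real Lightning Trainer loop):

    # Toy illustration of the ordering: the visualization hook runs after each
    # batch, before the epoch-end hook where normalization takes place.
    def on_test_batch_end(outputs):       # stand-in for the visualizer callback
        print("visualize:", outputs)      # scores are still raw here

    def test_epoch_end(all_outputs):      # stand-in for the model hook
        print("normalize now:", all_outputs)

    collected = []
    for batch_outputs in ("raw scores of batch 0", "raw scores of batch 1"):
        on_test_batch_end(batch_outputs)  # fires immediately after each test_step
        collected.append(batch_outputs)
    test_epoch_end(collected)             # runs only after every batch was already visualized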

The quick fix I tried was to re-use the min and max values from the validation step and to normalize in the test_step function. Then I got values within the 0-100% range.

    def test_step(self, batch: dict[str, str | Tensor], batch_idx: int, *args, **kwargs) -> STEP_OUTPUT:
        """Normalize the scores of the current batch using the validation-time min/max."""
        super().test_step(batch, batch_idx)
        logger.info(f"{self.max_scores=}/{self.min_scores=}")
        # Original per-batch min/max update, disabled so that the min/max
        # collected during validation are reused instead:
        # self.max_scores = max(self.max_scores, torch.max(batch["pred_scores"]))
        # self.min_scores = min(self.min_scores, torch.min(batch["pred_scores"]))
        batch["pred_scores"] = self._normalize(batch["pred_scores"])
        return batch

    def test_epoch_end(self, outputs: EPOCH_OUTPUT) -> EPOCH_OUTPUT:
        """Scores were already normalized per batch in test_step, so nothing to rescale here."""
        # Original epoch-end normalization, no longer needed:
        # for prediction in outputs:
        #     prediction["pred_scores"] = self._normalize(prediction["pred_scores"])
        super().test_epoch_end(outputs)
        return outputs

That being said, this is 100% not properly tested: the predictions are still all labeled as anomalies and the network has a very low AUROC score (~0.3). I'm not sure whether I broke something or not; I'll look into it more tomorrow if I have the time.

FFFox-abc commented 6 months ago

@D3lt4lph4 have you solved this problem?