openvinotoolkit / anomalib

An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.
https://anomalib.readthedocs.io/en/latest/
Apache License 2.0

[Bug]: Transformations behaviour possibly not expected when predicting with fresh dataloader #2254

Open blaz-r opened 2 months ago

blaz-r commented 2 months ago

Describe the bug

Inside the engine code, the transforms from the datamodule and dataloader are taken before the ones from the model: https://github.com/openvinotoolkit/anomalib/blob/2bd2842ec33c6eedb351d53cf1a1082069ff69dc/src/anomalib/engine/engine.py#L382-L398

This can cause problems if the datamodule was never used inside the trainer. In that case, as the following code shows, the transform returned is just a resize (since there is no trainer to take the correct model transforms from), so any normalization defined by the model is missed: https://github.com/openvinotoolkit/anomalib/blob/2bd2842ec33c6eedb351d53cf1a1082069ff69dc/src/anomalib/data/base/datamodule.py#L266-L277
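To make the priority easier to follow, here is a rough, simplified sketch of the resolution order described above (illustrative only, not the actual anomalib code; function names are condensed for illustration, only dataset.transform and model.configure_transforms come from the issue itself):

    from torchvision.transforms.v2 import Resize

    # Simplified sketch of the behaviour in the linked datamodule code:
    # without a trainer attached, only a resize is returned, so the model's
    # normalization is lost.
    def datamodule_transform(datamodule, image_size):
        if getattr(datamodule, "trainer", None) is not None:
            # with a trainer attached, the model's transforms would be picked up
            return datamodule.trainer.model.transform
        return Resize(image_size)

    # Simplified sketch of the behaviour in the linked engine code:
    # the dataloader/datamodule transform is checked first, and the model's
    # transforms are only used as a fallback.
    def resolve_transform(dataloader, model, image_size):
        if getattr(dataloader.dataset, "transform", None) is not None:
            return dataloader.dataset.transform
        return model.configure_transforms(image_size)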

This happens only when setup and the dataloader are called on the datamodule outside the trainer (since the trainer is not set in that case), like this:

datamodule = MVTec(...)
datamodule.setup()  # this calls the train_transform property mentioned above
datamodule.test_dataloader()  # only a resize transform here

There is also code below to reproduce this.

A workaround I have at the moment is doing this:

dataloader.dataset.transform = None

This way the dataloader transforms are ignored and the ones from the model are used. Another possibility I see is manually setting the transforms of the datamodule with model.configure_transforms(image_size).
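For completeness, both options side by side (a minimal sketch; the image size and assigning the result of configure_transforms directly to the dataset are assumptions, only dataset.transform and model.configure_transforms are taken from the description above):

    dataloader = datamodule.test_dataloader()

    # Option 1: drop the dataset transform so the model's own transforms are used
    dataloader.dataset.transform = None

    # Option 2: explicitly assign the model's default transforms to the dataset
    # (image size chosen to match the datamodule)
    dataloader.dataset.transform = model.configure_transforms(image_size=(42, 42))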

Dataset

N/A

Model

N/A

Steps to reproduce the behavior

    from anomalib.data import MVTec

    #### CONFIGURE THIS ####
    mvtec_path = "../datasets/MVTec"
    #####

    data = MVTec(root=mvtec_path, image_size=(42, 42), category="bottle", num_workers=0)
    data.setup()

    # prints only a Resize transform; the model's normalization is missing
    print(data.test_dataloader().dataset.transform)
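For comparison, the model's default transforms (which typically contain the normalization the fresh dataloader is missing) can be printed the same way; this is a sketch assuming a Padim model, any model exposing configure_transforms should behave similarly:

    from anomalib.models import Padim

    model = Padim()
    # typically a resize followed by normalization, unlike the bare resize printed above
    print(model.configure_transforms(image_size=(42, 42)))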

OS information

No response

Expected behavior

I would expect the model transforms to take priority over the ones in the dataloader in this case, but I see how that could cause trouble when custom transforms are set inside the datamodule.

Screenshots

No response

Pip/GitHub

GitHub

What version/branch did you use?

1.2.0dev

Configuration YAML

/

Logs

/

Code of Conduct

samet-akcay commented 2 months ago

@djdameln any thoughts?

blaz-r commented 1 month ago

Are there any updates on this one?

samet-akcay commented 1 month ago

@blaz-r, not to address this one specifically, but @djdameln is working on some changes that might address this.

blaz-r commented 1 month ago

Okay. I'll use the workaround for now and we'll see when those changes are added.