openvinotoolkit / anomalib

An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.
https://anomalib.readthedocs.io/en/latest/
Apache License 2.0

[Bug]: The max_epochs setting for Engine is invalid #1873

Closed · xns0318 closed this issue 6 months ago

xns0318 commented 6 months ago

Describe the bug

Setting min_epochs and max_epochs has no effect; training always stops with "Trainer.fit stopped: max_epochs=1 reached."

Dataset

Folder

Model

PatchCore

Steps to reproduce the behavior

from anomalib.data import Folder
from anomalib.engine import Engine
from anomalib.models import Patchcore

datamodule = Folder(
    name="qigang", root=r"D:\detection\Anomaly-Detection\dataset\qigang",
    normal_dir=r"train\good", abnormal_dir=r"test\first",
    mask_dir=r"ground_truth\first", normal_split_ratio=0.2, seed=2024,
)
datamodule.setup()
model = Patchcore().cuda()
engine = Engine(
    accelerator="cuda", devices=1, check_val_every_n_epoch=10,
    min_epochs=10, max_epochs=100, max_steps=-1,
)
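
The snippet above does not include the training call itself, but the log below shows that Trainer.fit was run. A minimal completion, assuming the standard anomalib v1 Engine.fit signature, would be:

# Hypothetical completion of the reproduction script: the log below implies
# that training was started via Engine.fit with the objects created above.
engine.fit(model=model, datamodule=datamodule)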

OS information

OS information:

Expected behavior

I expect training to run for the number of epochs I set.

Screenshots

No response

Pip/GitHub

pip

What version/branch did you use?

No response

Configuration YAML

model:
  class_path: anomalib.models.Patchcore
  init_args:
    backbone: wide_resnet50_2
    pre_trained: true
    layers:
      - layer2
      - layer3
    coreset_sampling_ratio: 0.1
    num_neighbors: 9

data:
  class_path: anomalib.data.Folder
  init_args:
    name: gangtou
    root: "D:\\detection\\Anomaly-Detection\\dataset\\qigang"
    normal_dir: "train/good"
    abnormal_dir: "test/first"
    normal_test_dir: "test/good"
    mask_dir: "ground_truth/first"
    normal_split_ratio: 0
    extensions: [".png"]
    image_size: [512, 512]
    center_crop: null
    normalization: imagenet
    train_batch_size: 32
    eval_batch_size: 32
    num_workers: 8
    task: SEGMENTATION
    transform_config_train: null
    transform_config_eval: null
    test_split_mode: FROM_DIR
    test_split_ratio: 0.2
    val_split_mode: same_as_test
    val_split_ratio: 0.3
    seed: null

metrics:
  image:
    - F1Score
    - AUROC
  pixel:
    - F1Score
    - AUROC
  threshold:
    class_path: anomalib.metrics.F1AdaptiveThreshold
    init_args:
      default_value: 0.5

Logs

`Trainer.fit` stopped: `max_epochs=1` reached.

Code of Conduct

samet-akcay commented 6 months ago

@xns0318, this is because Patchcore needs only one epoch of training to extract its features. Increasing max_epochs does not improve Patchcore training, so max_epochs is hard-coded here: https://github.com/openvinotoolkit/anomalib/blob/165702f7f7887b82a5a922fe25582376c785d47e/src/anomalib/models/image/patchcore/lightning_model.py#L120
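
You can confirm this locally: the hard-coded values are exposed through the model's trainer_arguments attribute, which the Engine applies regardless of the values passed by the user. A minimal check, assuming the anomalib v1.x API, could look like:

# Sketch: inspect the trainer arguments declared by the model itself
# (expected to contain max_epochs=1, which overrides the user-supplied value).
from anomalib.models import Patchcore

print(Patchcore().trainer_arguments)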

For more details, refer to the PatchCore paper (Towards Total Recall in Industrial Anomaly Detection): https://arxiv.org/abs/2106.08265

xns0318 commented 6 months ago

Thank you very much, @samet-akcay!