openvinotoolkit / anomalib

An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.
https://anomalib.readthedocs.io/en/latest/
Apache License 2.0

[Bug]: torch.onnx.export on loaded PatchcoreModel model throws "IndexError: Dimension out of range (expected to be in range of [-1, 0], but got -2)" error. #2108

Open asnecemnnit opened 5 months ago

asnecemnnit commented 5 months ago

Describe the bug

I am still facing the issue described in #1331. However, I am calling torch.onnx.export directly on a loaded PatchcoreModel. As a result, self.memory_bank is never initialized, which leads to this error inside euclidean_dist.
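
For reference, a minimal self-contained sketch (not from the original report) of the presumed failure mode: before training, the Patchcore memory bank is just an empty 1-D tensor, and any operation that addresses dimension -2 on it, such as a transpose inside the distance computation, raises exactly this IndexError.

import torch

memory_bank = torch.Tensor()  # uninitialized memory bank: an empty 1-D tensor
try:
    memory_bank.transpose(-2, -1)  # requires at least two dimensions
except IndexError as err:
    print(err)  # Dimension out of range (expected to be in range of [-1, 0], but got -2)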

Dataset

Other (please specify in the text field below)

Model

PatchCore

Steps to reproduce the behavior

Execute the following code after installing anomalib:

"""Utilities for optimization and OpenVINO conversion."""

# Copyright (C) 2022-2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import logging
from enum import Enum

import torch
from torch import nn
from torchvision.transforms.v2 import CenterCrop, Compose, Resize, Transform

from anomalib.data.transforms import ExportableCenterCrop

logger = logging.getLogger("anomalib")

class ExportType(str, Enum):
    """Model export type.

    Examples:
        >>> from anomalib.deploy import ExportType
        >>> ExportType.ONNX
        'onnx'
        >>> ExportType.OPENVINO
        'openvino'
        >>> ExportType.TORCH
        'torch'
    """

    ONNX = "onnx"
    OPENVINO = "openvino"
    TORCH = "torch"

class CompressionType(str, Enum):
    """Model compression type when exporting to OpenVINO.

    Examples:
        >>> from anomalib.deploy import CompressionType
        >>> CompressionType.INT8_PTQ
        'int8_ptq'
    """

    FP16 = "fp16"
    """
    Weight compression (FP16)
    All weights are converted to FP16.
    """
    INT8 = "int8"
    """
    Weight compression (INT8)
    All weights are quantized to INT8, but are dequantized to floating point before inference.
    """
    INT8_PTQ = "int8_ptq"
    """
    Full integer post-training quantization (INT8)
    All weights and operations are quantized to INT8. Inference is done in INT8 precision.
    """

class InferenceModel(nn.Module):
    """Inference model for export.

    The InferenceModel is used to wrap the model and transform for exporting to torch and ONNX/OpenVINO.

    Args:
        model (nn.Module): Model to export.
        transform (Transform): Input transform for the model.
        disable_antialias (bool, optional): Disable antialiasing in the Resize transforms of the given transform. This
            is needed for ONNX/OpenVINO export, as antialiasing is not supported in the ONNX opset.
    """

    def __init__(self, model: nn.Module, transform: Transform, disable_antialias: bool = False) -> None:
        super().__init__()
        self.model = model
        self.transform = transform
        self.convert_center_crop()
        if disable_antialias:
            self.disable_antialias()

    def forward(self, batch: torch.Tensor) -> torch.Tensor | tuple[torch.Tensor, torch.Tensor]:
        """Transform the input batch and pass it through the model."""
        batch = self.transform(batch)
        return self.model(batch)

    def disable_antialias(self) -> None:
        """Disable antialiasing in the Resize transforms of the given transform.

        This is needed for ONNX/OpenVINO export, as antialiasing is not supported in the ONNX opset.
        """
        if isinstance(self.transform, Resize):
            self.transform.antialias = False
        if isinstance(self.transform, Compose):
            for transform in self.transform.transforms:
                if isinstance(transform, Resize):
                    transform.antialias = False

    def convert_center_crop(self) -> None:
        """Convert CenterCrop to ExportableCenterCrop for ONNX export.

        The original CenterCrop transform is not supported in ONNX export. This method replaces the CenterCrop to
        ExportableCenterCrop, which is supported in ONNX export. For more details, see the implementation of
        ExportableCenterCrop.
        """
        if isinstance(self.transform, CenterCrop):
            self.transform = ExportableCenterCrop(size=self.transform.size)
        elif isinstance(self.transform, Compose):
            transforms = self.transform.transforms
            for index in range(len(transforms)):
                if isinstance(transforms[index], CenterCrop):
                    transforms[index] = ExportableCenterCrop(size=transforms[index].size)

# Replace with the appropriate input size for your model
dummy_input = torch.randn(1, 3, 256, 256)  # Example for an image classification model

from pathlib import Path
from anomalib.models import EfficientAd, Padim, Patchcore, UnknownModelError, get_model
model = get_model("Patchcore", backbone="wide_resnet50_2")

# Ensure the model is in evaluation mode
model.eval()

onnx_path = "model.onnx"
torch.onnx.export(model=model, args=dummy_input, f=onnx_path,
                  input_names=['input'], output_names=['output'])

print(f"Model has been exported to {onnx_path}")

OS information

OS information:

Expected behavior

Model should be exported in ONNX format.

Screenshots

No response

Pip/GitHub

pip

What version/branch did you use?

1.1.0

Configuration YAML

N/A

Logs

Exception has occurred: IndexError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
Dimension out of range (expected to be in range of [-1, 0], but got -2)

Code of Conduct

blaz-r commented 5 months ago

Could you share more of the error trace, please? I'm not sure if this is ONNX related, but this issue can happen if your input is not 4D. I think it should be [B, C, H, W], so even a single image should be [1, 3, H, W]. Or it might be related to the anomaly maps, which should be [1, H, W], not [H, W], if I recall correctly.
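
For illustration (a minimal sketch, not part of the original comment), adding the missing batch dimension to a single image looks like this:

import torch

image = torch.randn(3, 256, 256)  # a single image without a batch dimension
batch = image.unsqueeze(0)        # shape becomes [1, 3, 256, 256]
print(batch.shape)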

asnecemnnit commented 5 months ago

@blaz-r The issue can be reproduced by running the code snippet I provided above. As you can see in the code, I am passing a dummy input to the export call in (B, C, H, W) format: dummy_input = torch.randn(1, 3, 256, 256)

blaz-r commented 5 months ago

Ah okay, I thought this was the export code for a trained model. Upon closer inspection I now see where the problem is. Patchcore relies on a memory bank that is built during training. In the code above, the model is not trained at all, so the memory bank is empty, which leads to this error.

I believe you can export the model directly with your code, but for it to work you'll need to train the model beforehand. Refer to the training guide for how to do that.
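
As a rough sketch of that workflow (the datamodule, category, and backbone below are placeholders, not from this thread), training Patchcore with the Engine first populates the memory bank, after which the export can run:

from anomalib.data import MVTec
from anomalib.deploy import ExportType
from anomalib.engine import Engine
from anomalib.models import Patchcore

# Assumes the MVTec dataset is available locally; any anomalib datamodule works.
datamodule = MVTec(category="bottle")
model = Patchcore(backbone="wide_resnet50_2")

engine = Engine()
engine.fit(model=model, datamodule=datamodule)            # builds the memory bank
engine.export(model=model, export_type=ExportType.ONNX)   # export the trained model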

asnecemnnit commented 5 months ago

Thanks for the quick replies! I also have a trained "model.pt" which, after loading, looks like the following: {'model': InferenceModel( (model): PatchcoreModel() ..... }

Can you suggest the correct way to load it? When I export it, I get this error instead: error
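
For context, a hedged sketch of one way this might be loaded (assuming the .pt file was produced by anomalib's torch export and stores the wrapper under the "model" key as shown above; the input size is a placeholder):

import torch

checkpoint = torch.load("model.pt", map_location="cpu")
inference_model = checkpoint["model"]   # the InferenceModel wrapper shown above
inference_model.eval()

dummy_input = torch.randn(1, 3, 256, 256)
torch.onnx.export(inference_model, dummy_input, "model.onnx",
                  input_names=["input"], output_names=["output"])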

blaz-r commented 5 months ago

I am not sure what the issue could be here, but I'd recommend exporting the model with anomalib directly: anomalib export --model Patchcore --export_mode onnx --ckpt_path <PATH_TO_CHECKPOINT> --input_size "[256,256]"

For more help with exporting you can use the anomalib CLI: anomalib export -h

I do think we'll need to expand the docs on this one though. @ashwinvaidya17 am I missing something, or is there no documentation on this currently?
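
For completeness, a hedged sketch of the programmatic equivalent of that CLI call (the ckpt_path argument is an assumption based on the Engine API and may differ between versions; the checkpoint path is a placeholder):

from anomalib.deploy import ExportType
from anomalib.engine import Engine
from anomalib.models import Patchcore

model = Patchcore()
engine = Engine()
engine.export(model=model, export_type=ExportType.ONNX, ckpt_path="<PATH_TO_CHECKPOINT>")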

asnecemnnit commented 5 months ago

I tried anomalib export -h after a full installation of anomalib, but it didn't recognize the export command. I also tried using the export method in anomalib.engine, but it fails to load the model checkpoint due to a version conflict.

blaz-r commented 5 months ago

That is unusual; anomalib export -h works fine on my side. I'm not sure how the engine export works when called directly, but judging by the docstring it's mostly intended to be used with the CLI, although it looks like it could work from code as well.