[Task]: Grayscale / Singlechannel Image Support

For feature-similarity-based methods like Patchcore or Padim, this might work by changing the pre-trained model's architecture. There are different ways to build a one-channel model from a three-channel model, such as aggregating the weights of the first convolution over its input channels:

import torch
import timm

model = timm.create_model('resnet50', pretrained=True)
# Sum the RGB kernels over the input-channel axis: (64, 3, 7, 7) -> (64, 1, 7, 7)
conv1_agg_weight = model.conv1.weight.detach().sum(dim=1, keepdim=True)
model.conv1 = torch.nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
model.conv1.weight.data = conv1_agg_weight
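
Because a convolution is linear in its input, summing the kernels over the input channels should give the same output as feeding the original three-channel layer a grayscale image copied into all three channels. The following is a minimal sketch of that check; it uses a randomly initialised `Conv2d` as a stand-in for the pre-trained `conv1` (an assumption made so that no weights have to be downloaded), and the variable names are illustrative only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a pre-trained 3-channel stem (random weights, no download).
conv_rgb = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)

# One-channel layer whose kernel is the sum over the RGB input channels.
conv_gray = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
with torch.no_grad():
    conv_gray.weight.copy_(conv_rgb.weight.sum(dim=1, keepdim=True))

gray = torch.rand(2, 1, 32, 32)      # batch of grayscale images
rgb_like = gray.repeat(1, 3, 1, 1)   # same image copied into R, G and B

with torch.no_grad():
    out_gray = conv_gray(gray)
    out_rgb = conv_rgb(rgb_like)

# The two outputs should agree up to floating-point rounding.
same = torch.allclose(out_gray, out_rgb, atol=1e-5)
print(same)
```

Note that this equivalence only covers the convolution itself. If the three-channel pipeline normalises each channel with different mean and standard deviation values, the one-channel input would need a matching normalisation to reproduce it.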

Another way would be to keep just one of the three input channels of the first convolution:

model = timm.create_model('resnet50', pretrained=True)
# Slice with 0:1 (not 0) to keep the channel axis: shape (64, 1, 7, 7)
conv1_weight_channel = model.conv1.weight.detach()[:, 0:1, :, :].clone()  # first channel conv weight
model.conv1 = torch.nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
model.conv1.weight.data = conv1_weight_channel

For reconstruction-based methods it's a bit different, however. For some you might only have to change the autoencoder architecture to accept one channel; for others, like EfficientAD, you would also have to alter the architecture of the corresponding pre-trained model.
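
To illustrate the simpler case, here is a minimal sketch of an autoencoder whose input and output channel count is a constructor parameter. `TinyConvAutoencoder` is a hypothetical toy model written for this example and is not taken from anomalib; it only shows that the first and last layers are the parts that depend on the number of image channels.

```python
import torch
import torch.nn as nn


class TinyConvAutoencoder(nn.Module):
    """Hypothetical toy autoencoder; not an anomalib model."""

    def __init__(self, in_channels: int = 1):
        super().__init__()
        # Only the first encoder layer depends on the input channels.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        # Only the last decoder layer depends on the output channels.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(16, in_channels, kernel_size=4, stride=2, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


model = TinyConvAutoencoder(in_channels=1)
x = torch.rand(4, 1, 64, 64)  # batch of grayscale images in [0, 1]
with torch.no_grad():
    recon = model(x)

# Per-pixel squared reconstruction error as a simple anomaly map.
anomaly_map = (x - recon).pow(2).mean(dim=1, keepdim=True)
print(tuple(recon.shape), tuple(anomaly_map.shape))
```

For a model like EfficientAD, changing these layers alone would not be enough, since the pre-trained network it relies on expects three-channel input as well.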

openvinotoolkit / anomalib

[Task]: Grayscale / Singlechannel Image Support #2175
