Closed — arseniymerkulov closed this issue 7 months ago
What size is the input image? If it's not square, you'll also need a centered crop. Otherwise the Resize is going to result in uneven sides (one side will be 224 but the other won't) and break.
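For reference, a centered crop just takes the middle window of the image. A minimal sketch of the coordinate math in plain Python (the function name is mine, not part of the onnxruntime-extensions API):

```python
def center_crop_box(height, width, crop_h, crop_w):
    """Return (top, left, bottom, right) of a centered crop window."""
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return top, left, top + crop_h, left + crop_w

# e.g. cropping 224x224 out of a 256x341 resize result:
# center_crop_box(256, 341, 224, 224) -> (16, 58, 240, 282)
```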
pipeline.add_pre_processing(
[
Resize(256, layout='CHW'),
Transpose([1, 2, 0]), # CHW to HWC. We can look at adding ChannelsFirstToChannelsLast to simplify
CenterCrop(224, 224), # must be HWC currently. We can look at adding CHW support.
ChannelsLastToChannelsFirst(),
ImageBytesToFloat(),
Normalize([(0.5, 0.5), (0.5, 0.5), (0.5, 0.5)]),
Unsqueeze([0]),
]
)
Would also be good to update the original model to opset 17 for consistency. Do that prior to adding the pre-processing.
python -m onnxruntime.tools.update_onnx_opset --opset 17 model.onnx model.opset17.onnx
Yes, input images can be rectangles, thank you. This set of preprocessing steps results in a drop in accuracy from 0.97 to 0.85 compared to the torch transforms:
transforms.Compose([
transforms.Resize(size=(224, 224),
interpolation=transforms.InterpolationMode.BILINEAR,
max_size=None,
antialias=None),
transforms.ConvertImageDtype(torch.float),
transforms.Normalize(mean=torch.Tensor([0.5000, 0.5000, 0.5000]),
std=torch.Tensor([0.5000, 0.5000, 0.5000]))
])
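For comparison, the ConvertImageDtype + Normalize pair above maps uint8 pixels into roughly [-1, 1] (divide by 255, then subtract mean 0.5 and divide by std 0.5). A minimal sketch of that per-pixel math in plain Python (the function name is mine):

```python
def normalize_pixel(p, mean=0.5, std=0.5):
    """uint8 pixel -> float in [0, 1] -> (x - mean) / std."""
    return (p / 255.0 - mean) / std

# 0 maps to -1.0 and 255 maps to 1.0
```

ImageBytesToFloat + Normalize in the onnxruntime-extensions pipeline should produce the same mapping, so the accuracy gap is unlikely to come from this step.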
After I replaced the crop step with a letterbox, accuracy increased to 0.92:
pipeline.add_pre_processing(
[
Resize((224, 224), policy='not_larger', layout='CHW'),
LetterBox(target_shape=(224, 224), layout='CHW'),
ImageBytesToFloat(),
Normalize([(0.5, 0.5), (0.5, 0.5), (0.5, 0.5)]),
Unsqueeze([0]),
]
)
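To reason about what this pipeline does geometrically: Resize with policy='not_larger' scales so that neither side exceeds the target while keeping aspect ratio, and LetterBox then pads out to the target shape. A minimal sketch of the size math in plain Python (names are mine, not the extensions API itself):

```python
def resize_not_larger(h, w, target):
    """Scale so neither side exceeds target, keeping aspect ratio."""
    scale = min(target / h, target / w)
    return round(h * scale), round(w * scale)

def letterbox_padding(h, w, target):
    """Total padding needed on each axis to reach target x target."""
    return target - h, target - w

# a 480x640 image resizes to 168x224, then letterbox adds 56 rows of padding
```

The torch Resize((224, 224)) above instead stretches both sides to exactly 224, distorting the aspect ratio, which is why the two pipelines still disagree.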
Accuracy is still below the original; I guess this is because torch Resize does not keep the same aspect ratio for an image. I didn't find a way to achieve that with the onnxruntime-extensions preprocessing steps. What is your advice on that? As an extreme case, I can change the preprocessing steps during finetuning to match the preprocessing steps at inference.
You need to match the preprocessing that was done when the model was trained. Whether to crop or letterbox depends on the model type.
e.g. for image classification it will tend to crop like here
For something like object detection or OCR you'd letterbox so you're not excluding any of the original image.
Typically in the pre-processing you'd maintain the aspect ratio, which is what the onnxruntime-extensions preprocessing steps support currently. If you randomly stretch it's going to be harder to train/match.
The difference could be antialiasing. Based on this, it sounds like the parameter is ignored for PIL images.
The ONNX Resize supports antialiasing from opset 18 on, so you may need to a) update your model to opset 18, and b) use opset 18 when adding the pre-processing.
Thank you for your answer
I am trying to use PrePostProcessor on a vit-tiny model in .onnx format with this code:
After saving, it looks OK to me in Netron, and I checked it with onnx.checker. The original model has a fixed input/output shape: [1, 3, 224, 224] -> [1, 28]
When I run the model with preprocessing using the code below, I get an error:
Inference code:
Attaching the model in .onnx format with and without preprocessing: model.with.preprocessing.onnx.zip model.zip