Closed shrinath-suresh closed 2 years ago
Hi @shrinath-suresh,
In the case of torchvision, you need to use the former INTERP_TRIANGULAR interpolation type, which can be achieved with INTERP_LINEAR and antialias=True, since torchvision enables antialiasing by default when using linear interpolation.
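To see why the antialias flag matters, here is a minimal pure-Python sketch (not DALI or torchvision code) of 1D linear resampling: with antialiasing, the triangle filter's support is widened by the scale factor when downscaling, so each output pixel averages more input pixels.

```python
# Sketch: 1D linear (triangular) resampling with and without antialiasing.
# Not the DALI/torchvision implementation; illustrates the concept only.
def resize_linear(src, dst_len, antialias):
    scale = len(src) / dst_len          # > 1 means downscaling
    # Antialiasing widens the triangle filter support by the scale
    # factor, so downscaling averages over more input samples.
    support = max(scale, 1.0) if antialias else 1.0
    out = []
    for i in range(dst_len):
        center = (i + 0.5) * scale - 0.5
        lo = int(center - support)
        hi = int(center + support) + 1
        acc = wsum = 0.0
        for j in range(lo, hi + 1):
            w = max(0.0, 1.0 - abs(j - center) / support)  # triangle kernel
            if w > 0:
                acc += w * src[min(max(j, 0), len(src) - 1)]  # clamp at edges
                wsum += w
        out.append(acc / wsum)
    return out

src = [float(x % 7) for x in range(16)]
plain = resize_linear(src, 4, antialias=False)
smooth = resize_linear(src, 4, antialias=True)
print(plain != smooth)  # when downscaling, the two modes disagree
```

When upscaling (scale < 1) the two modes coincide, which is why the discrepancy only shows up when the image is shrunk.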
On top of that, another source of discrepancy is JPEG decoding. There's no bit-exact JPEG decoding standard - in general, the better the PSNR for an encode-decode round trip, the better, so different decoders employ different tricks to improve the result, sometimes optimized for images from a specific field. In the case of nvJPEG (which DALI uses under the hood for GPU-accelerated decoding), the conversion from YUV to RGB uses a different interpolation strategy than libjpeg-turbo, which is used for CPU decoding (both in torchvision and DALI). So the following pipeline should yield better results:
```python
import nvidia.dali as dali
import nvidia.dali.types as types
from nvidia.dali import pipeline_def

@pipeline_def(batch_size=1, num_threads=1, device_id=0)
def dali_pipeline(batch_tensor):
    jpegs = dali.fn.external_source(source=[batch_tensor], dtype=types.UINT8)
    # Decode on the CPU (libjpeg-turbo) to match torchvision's decoder
    jpegs = dali.fn.decoders.image(jpegs, device="cpu")
    jpegs = dali.fn.resize(
        jpegs,
        size=[256],
        subpixel_scale=False,
        interp_type=types.DALIInterpType.INTERP_LINEAR,
        antialias=True,
        mode="not_smaller",
    )
    normalized = dali.fn.crop_mirror_normalize(
        jpegs,
        crop_pos_x=0.5,
        crop_pos_y=0.5,
        crop=(224, 224),
        mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
        std=[0.229 * 255, 0.224 * 255, 0.225 * 255],
    )
    return normalized
```
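The decoder mismatch mentioned above comes largely from chroma handling: JPEG usually stores chroma at reduced resolution (e.g. 4:2:0), and the upsampling filter used before the YCbCr-to-RGB conversion is not standardized. A minimal pure-Python sketch (illustrative, not the actual nvJPEG or libjpeg-turbo code) of two plausible 1D chroma upsampling strategies:

```python
# Sketch: two decoders upsampling the same half-resolution chroma row
# differently. Illustrative only; not the real nvJPEG/libjpeg-turbo code.
def upsample_nearest(chroma):
    # Plain sample replication (the simplest strategy).
    out = []
    for c in chroma:
        out += [c, c]
    return out

def upsample_fancy(chroma):
    # Triangle-filter ("fancy") upsampling: each output pair is a 3:1
    # weighted blend of the sample and its neighbor, with rounding.
    out = []
    for i, c in enumerate(chroma):
        left = chroma[max(i - 1, 0)]
        right = chroma[min(i + 1, len(chroma) - 1)]
        out += [(3 * c + left + 2) // 4, (3 * c + right + 2) // 4]
    return out

cb = [100, 140, 120, 130]
print(upsample_nearest(cb))  # [100, 100, 140, 140, 120, 120, 130, 130]
print(upsample_fancy(cb))    # [100, 110, 130, 135, 125, 123, 128, 130]
```

Both are valid decodings of the same JPEG data, yet they produce different pixel values, which then propagate through resize and normalization.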
You can check this standalone example for reference: Resize_example.zip.
Still, it is more or less expected that if inference is run with a different data processing pipeline (one that is not bit-exact) than the one the network was trained with, the results will be slightly different.
While comparing the tensor output from torch transforms and DALI transforms, we are observing a difference in the output.
Attaching a notebook with a fully reproducible example - Dali preprocessing repro.zip
Alternatively, the steps below can be followed:
Download the sample image (kitten.jpg) from here
Load the image as bytes (TorchServe uses a bytearray as input), hence we want to keep it this way
Preprocess the image with torch transforms
Define the DALI pipeline
Preprocess using the DALI pipeline
Tensor output from the torch transform
Tensor output from the DALI transform
Both tensors are not equal, as the code below returns False
We observe a delta of 0.07 between these tensors, as the code below returns True
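The kind of comparison described above can be sketched as follows (plain Python lists for illustration; with real tensors the equivalent call would be an allclose-style check such as `torch.allclose(a, b, atol=0.07)`, and the values below are made up, not the actual outputs):

```python
# Sketch: exact equality fails while a tolerance-based check passes.
def max_abs_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def allclose(a, b, atol):
    return max_abs_diff(a, b) <= atol

torch_out = [0.485, -1.120, 0.932]   # illustrative values only
dali_out = [0.452, -1.101, 0.965]
print(torch_out == dali_out)                      # False: not bit-exact
print(allclose(torch_out, dali_out, atol=0.07))   # True: within 0.07
```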
Passing the preprocessed output to the pretrained ResNet model predicts the same class in both torch and DALI. However, the probabilities differ.
Output using torch tensor
Output using DALI tensor
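The effect described above (same predicted class, different probabilities) follows directly from how softmax reacts to small logit shifts; a minimal sketch with made-up logits (not the real model outputs):

```python
import math

# Sketch: small logit perturbations keep the argmax (same predicted
# class) but change the softmax probability values.
def softmax(logits):
    m = max(logits)                       # subtract max for stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

logits_torch = [2.0, 0.5, 0.1]            # illustrative logits only
logits_dali = [1.9, 0.6, 0.1]             # slightly perturbed
p1, p2 = softmax(logits_torch), softmax(logits_dali)
print(p1.index(max(p1)) == p2.index(max(p2)))  # True: same class
print(p1 == p2)                                # False: different probabilities
```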
We would like to know if this behaviour is expected, or if there is any way to fix the preprocessed tensor output (to be the same between torch transforms and DALI).
Note: I have already gone through the existing closed issue - https://github.com/NVIDIA/DALI/issues/3610 - and updated the pipeline accordingly.
Tested with PyTorch 1.12.1, CUDA 11.3, and DALI 1.16.1.