rmcavoy opened this issue 4 years ago
imgaug uses `cv2.resize()`, so this is really just the difference between PIL and cv2.
I could imagine minor deviations if one of them casts float values to int and the other rounds them before casting.
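That cast-versus-round difference is easy to see in isolation. A minimal sketch (the interpolated float values are made up for illustration):

```python
import numpy as np

# Made-up interpolated float values, as a resize kernel might produce them.
interp = np.array([109.7, 78.6, 110.2])

# Casting truncates toward zero; rounding first can land one step higher.
cast = interp.astype(np.uint8)                # [109, 78, 110]
rounded = np.round(interp).astype(np.uint8)   # [110, 79, 110]

print(cast.mean(), rounded.mean())  # the averages already differ slightly
```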
Though I just tried a small test script on a few example images and do not see one of the methods performing clearly more accurately than the other, unless you use area interpolation. Small deviations on the order of 0.1 are imho expected. As information is removed, the average pixel values will never match perfectly. If such small deviations are already enough to significantly affect a model, I would worry more about that model's robustness than about the underlying resize method. Though there is of course the possibility that one of the methods introduces recurring patterns in the image that are imperceptible to the human eye (similar to JPEG compression artifacts) while the other method does not. Detecting these would require much more than a simple average, though.
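A global average can hide such patterns; a slightly stronger (still simple) check is to look at per-pixel difference statistics between the two resized results. A minimal sketch, using synthetic stand-ins for the two outputs (in practice these would be the PIL and cv2/imgaug results, which have the same shape and dtype):

```python
import numpy as np

# Synthetic stand-ins for two resized versions of the same image.
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
b = np.clip(a.astype(np.int16) + rng.integers(-2, 3, size=a.shape),
            0, 255).astype(np.uint8)

# Compare in a signed dtype so negative differences are not wrapped.
diff = a.astype(np.int16) - b.astype(np.int16)
print("mean diff:           ", diff.mean())
print("max abs diff:        ", np.abs(diff).max())
print("differing pixels (%):", 100 * (diff != 0).mean())
```

Two methods can agree on the mean while disagreeing on many individual pixels, which is exactly what this would reveal.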
```python
import imageio
import numpy as np
import imgaug as ia
import PIL.Image


def main():
    image_urls = [
        "https://upload.wikimedia.org/wikipedia/commons/8/8c/South-western_black_rhinoceros_%28Diceros_bicornis_occidentalis%29_female.jpg",
        "https://upload.wikimedia.org/wikipedia/commons/thumb/1/1c/Squirrel_posing.jpg/919px-Squirrel_posing.jpg",
        "https://upload.wikimedia.org/wikipedia/commons/2/2e/MC_Drei-Finger-Faultier.jpg",
        "https://upload.wikimedia.org/wikipedia/commons/thumb/7/78/Church_Heart_of_the_Andes.jpg/1280px-Church_Heart_of_the_Andes.jpg"
    ]
    images = [imageio.imread(url) for url in image_urls]

    for inter in ["linear", "cubic", "area"]:
        inter_pil = {
            "linear": PIL.Image.BILINEAR,
            "cubic": PIL.Image.BICUBIC,
            "area": PIL.Image.BOX
        }[inter]
        for dt in ["uint8"]:
            print("-------------")
            print(f"{inter} {dt}")
            print("-------------")
            for image in images:
                image = image.astype(np.dtype(dt))
                height, width = int(image.shape[0] * 0.5), int(image.shape[1] * 0.5)
                image_ia = ia.imresize_single_image(
                    image, (height, width), interpolation=inter
                )
                # note: PIL expects (width, height), imgaug (height, width)
                image_pil = np.asarray(
                    PIL.Image.fromarray(image).resize(
                        (width, height), resample=inter_pil
                    )
                )
                assert image_ia.shape == image_pil.shape
                # order must match the "pil"/"ia" labels printed below
                avgs = [np.average(im) for im in [image, image_pil, image_ia]]
                diffs = [avgs[0] - avgs[1], avgs[0] - avgs[2]]
                print(
                    "averages orig[%7.3f] pil[%7.3f] ia[%7.3f] "
                    "| diffs pil[%7.3f] ia[%7.3f]" % (*avgs, *diffs)
                )


if __name__ == "__main__":
    main()
```
Output:

```
-------------
linear uint8
-------------
averages orig[110.143] pil[110.269] ia[110.269] | diffs pil[ -0.126] ia[ -0.127]
averages orig[ 78.866] pil[ 78.808] ia[ 78.931] | diffs pil[  0.059] ia[ -0.065]
averages orig[110.512] pil[110.637] ia[110.636] | diffs pil[ -0.125] ia[ -0.124]
averages orig[ 98.146] pil[ 98.011] ia[ 98.208] | diffs pil[  0.134] ia[ -0.062]
-------------
cubic uint8
-------------
averages orig[110.143] pil[110.142] ia[110.147] | diffs pil[  0.000] ia[ -0.005]
averages orig[ 78.866] pil[ 78.889] ia[ 78.874] | diffs pil[ -0.022] ia[ -0.008]
averages orig[110.512] pil[110.511] ia[110.518] | diffs pil[  0.001] ia[ -0.006]
averages orig[ 98.146] pil[ 98.196] ia[ 98.155] | diffs pil[ -0.051] ia[ -0.010]
-------------
area uint8
-------------
averages orig[110.143] pil[110.269] ia[110.642] | diffs pil[ -0.126] ia[ -0.499]
averages orig[ 78.866] pil[ 78.866] ia[ 79.361] | diffs pil[  0.000] ia[ -0.495]
averages orig[110.512] pil[110.637] ia[110.994] | diffs pil[ -0.125] ia[ -0.482]
averages orig[ 98.146] pil[ 98.146] ia[ 98.690] | diffs pil[ -0.000] ia[ -0.544]
```
Good to know. Thanks!
I am resizing a 1920x1080 image to 1333x750 pixels using bilinear interpolation. On this simple task, PIL resize and imgaug resize (master) show very worrying differences.
The results I get back are img: 96.09632989326131, pil: 96.1052009669084, iaa: 95.98408402100524, where the PIL and imgaug resizes are clearly different, and the PIL one seems to more accurately maintain the average color values of the original.
It's not clear to me why they should perform differently when they both use bilinear interpolation on the same data (I could actually see a difference in performance on a downstream detection task with a model originally trained on the PIL resizing). The image used here is the test image "img.png" from https://github.com/zylo117/Yet-Another-EfficientDet-Pytorch/tree/master/test