python-pillow / Pillow

Python Imaging Library (Fork)
https://python-pillow.org

Skip img.convert(img.mode) #3266

Closed elbaro closed 6 years ago

elbaro commented 6 years ago
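from PIL import Image
import time

# Load once and make sure the image is already in RGB mode.
img = Image.open('large.png').convert('RGB')

# Time 1000 conversions to the mode the image already has.
t = time.perf_counter()
for i in range(1000):
    img = img.convert('RGB')
t2 = time.perf_counter()
print(t2 - t)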

It would be nice to check if img.mode==new_mode internally and skip if possible.

What versions of Pillow and Python are you using?

Python 3.6, Pillow 5.1.1.post0 (pillow-simd)

radarhere commented 6 years ago

This line - https://github.com/python-pillow/Pillow/blob/master/src/PIL/Image.py#L900 - does already skip most of the convert function if the modes are the same. However, load and copy operations are still performed.

While I agree with the thinking that it would be nice for your example script to have a near zero run time, the reason that it can't is that when convert does change modes, a new image instance is returned.

from PIL import Image

im = Image.new('RGB', (100, 100))
im2 = im.convert('RGBA')

Here, im2 is not im. For consistency, a new image instance should also be returned when the mode is the same, so we do need to copy, and that takes time. If we did not copy, the original image would be returned. Users who assume that convert always returns a new image could start applying operations to the result, only to discover that they had altered the original image instance instead.
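As a small illustration of that point, the sketch below converts an image to the mode it already has and then modifies the result. The variable names are made up for this example, and it only assumes the behaviour described above: a same-mode convert returns a copy.

```python
from PIL import Image

im = Image.new('RGB', (100, 100))   # starts out all black
same_mode = im.convert('RGB')       # same mode, but a separate Image object

# Change a pixel on the returned image only.
same_mode.putpixel((0, 0), (255, 0, 0))

is_new_instance = same_mode is not im
original_pixel = im.getpixel((0, 0))
converted_pixel = same_mode.getpixel((0, 0))
print(is_new_instance, original_pixel, converted_pixel)
```

If convert returned im itself here, the putpixel call would have changed the original image as well.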

elbaro commented 6 years ago

Thanks for the explanation.

Lazy copy may solve the problem but I don't know how hard it is to implement.

How about convert('RGB', inplace=True)? Pillow can take advantage of this explicit hint and avoid the copy in 'RGBA'->'RGB' as well as 'RGB'->'RGB'.

radarhere commented 6 years ago

Is there a specific problem you are trying to solve, or circumstance you are looking to make simpler? So far, I'm not seeing why it wouldn't be easier for end users to check themselves if the mode is the same, rather than navigating a new change to the convert method.
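A user-side check along those lines could look like the sketch below. The helper name ensure_mode is hypothetical and not part of Pillow. Note that, unlike convert, it hands back the original object when the mode already matches, so the caller must not rely on getting a fresh copy.

```python
from PIL import Image


def ensure_mode(img, mode):
    """Hypothetical helper: convert only when the mode actually differs."""
    if img.mode == mode:
        return img              # same object: no load or copy is performed
    return img.convert(mode)    # different mode: convert returns a new image


rgb = Image.new('RGB', (100, 100))
same = ensure_mode(rgb, 'RGB')    # returns rgb itself
rgba = ensure_mode(rgb, 'RGBA')   # returns a new RGBA image
print(same is rgb, rgba.mode)
```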

elbaro commented 6 years ago

This is about preventing mistakes. I often see people mistakenly convert an RGB image to RGB. In data science, where disk I/O is a frequent bottleneck, I have a few times seen a 10-30% performance gain just from catching this mistake (especially when 4K images are converted before downscaling).

But now I think the implementation may not be worth the effort.