Closed homm closed 2 months ago
You're essentially using _VALID_WEBP_MODES
in two locations, but you removed that constant. I assume you inlined them when _VALID_WEBP_MODES
and _VALID_WEBP_LEGACY_MODES
were each only used in one place, but it seems like _VALID_WEBP_MODES
could be kept. Also it could just be a Set instead of a Dict with True
values.
@Yay295 I've removed this constants primary since it's mistyping. Now I'm added separate _convert_frame
routine for this.
I expected these changes to improve speed, but I'm not finding a definitive difference when I test. Does that mirror your experience?
Are the improvements here more about theory and tidying up?
I don't know how much of an effect it has, but there is some overhead when using Capsules because they use strings as identifiers. So there's some added string comparisons behind the scenes.
@radarhere
Are the improvements here more about theory and tidying up?
All improvements are listed in the PR description. While elimination of .tobytes()
doesn't affect execution time much, it helps to avid extra image copy, which reduces memory footprint. See "Maximum resident set size" below. Also, ImagingSectionEnter/Leave
in WebPAnimEncoder
unlocks GIL in the current thread, which improves IO in other threads.
from io import BytesIO
from PIL import Image
for i in range(4):
with BytesIO() as f:
Image.new('RGB', (8192, 8192)).save(f, 'webp', method=0, quality=1)
# AMD Cpu
## Main branch
$ /usr/bin/time -v python test.py
Command being timed: "python test.py"
User time (seconds): 19.39
System time (seconds): 1.16
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:20.63
Maximum resident set size (kbytes): 1296972
Major (requiring I/O) page faults: 18
Minor (reclaiming a frame) page faults: 1484790
Voluntary context switches: 19
Involuntary context switches: 719
## webp-capsule branch
$ /usr/bin/time -v python test.py
Command being timed: "python test.py"
User time (seconds): 19.46
System time (seconds): 0.78
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:20.27
Maximum resident set size (kbytes): 1100660
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 1090899
Voluntary context switches: 1
Involuntary context switches: 1180
# Local Docker (M1 Pro)
## Main branch
$ /usr/bin/time -v python test.py
Command being timed: "python test.py"
User time (seconds): 4.51
System time (seconds): 0.35
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:04.89
Maximum resident set size (kbytes): 851160
Major (requiring I/O) page faults: 9
Minor (reclaiming a frame) page faults: 674262
Voluntary context switches: 495
Involuntary context switches: 88
## webp-capsule branch
$ /usr/bin/time -v python test.py
Command being timed: "python test.py"
User time (seconds): 4.18
System time (seconds): 0.24
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:04.44
Maximum resident set size (kbytes): 648432
Major (requiring I/O) page faults: 9
Minor (reclaiming a frame) page faults: 286806
Voluntary context switches: 390
Involuntary context switches: 10
@Yay295
some overhead when using Capsules because they use strings as identifiers
It's literally nothing comparing to the copying megabytes of image data to the byte string. But as you can see above, even the copying doesn't affect encoding time much.
Adoption of #8341 ideas
Improvements
.tobytes("raw", rawmode)
).has_transparency_data
for save and save_all.RGBX
mode without conversionImagingSectionEnter/Leave
forWebPAnimEncoder