resonating-sirsh opened 3 years ago
Hi @resonating-sirsh,
It'll keep some part of every input image in memory, so there is a limit. It'll also keep several hundred scanlines of the output image (so memuse will scale with image width), and quite a few per-thread caches too (so it'll also scale with concurrency).
I would make a small test case and watch memory behaviour with VIPS_LEAK, e.g.:

```
$ VIPS_LEAK=1 vips copy x.png x2.png
memory: high-water mark 4.59 MB
```
It prints the high-water mark of allocated pixel buffers. You can try tuning the VIPS_CONCURRENCY and VIPS_DISC_THRESHOLD settings.
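For a Python script, one way to apply these settings is as environment variables set before pyvips is first imported, since libvips reads them at startup. A minimal sketch (the values shown are only examples, not recommendations):

```python
# libvips reads these environment variables once, at startup, so in a
# Python script they must be set before pyvips is first imported.
import os

os.environ["VIPS_CONCURRENCY"] = "1"        # use a single worker thread
os.environ["VIPS_DISC_THRESHOLD"] = "100m"  # spill large images to a disc temp

# import pyvips   # <- only import after the variables are in place
```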
There's been talk of something to scavenge pixel buffers during evaluation, but it's not been implemented yet.
Thanks @jcupitt - your quick response is greatly appreciated!! I will look into those settings (in pyvips) and see if I can tweak something. Are there any other tricks I can use, even at the cost of speed, to limit memory use? For example, could I create smaller (clipped) partial tiles on disk and then use the joining functionality, or would I still face the same memory constraints? It may be the same problem, but I just want to figure out if it's worth pursuing in theory.
Yes there are lots of things you can tweak. The first thing is to make a representative benchmark.
One interesting finding for the record: using your observation @jcupitt that "it" stores parts of all the individual images in memory, I assumed that by "it" we mean the vips image object we are compositing over holds references to parts of these images. Hence, if I save to file and then reload the image with sequential access so as to "forget" those images (every 20 images), it seems to work (slowly) for as many images as I have tested...
```python
# test this ....
if i % 20 == 0 and i > 0:
    print("saving and reloading image")
    image.write_to_file(f"temp{i}.png")
    image = pyvips.Image.new_from_file(f"temp{i}.png", access="sequential")
    print("image reloaded. Doing more work")
```
> You can try tuning settings for VIPS_CONCURRENCY and VIPS_DISC_THRESHOLD
This looks useful! Are both also available for Python scripts?
Could you please share what units are used for VIPS_DISC_THRESHOLD (MB or pixels)?
The disc threshold is the size of the uncompressed image, i.e. width * height * bands * sizeof(element).
You can put the unit after the number, e.g. "1gb", "10mb", "100kb", etc.
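In concrete terms, the rule works out like this (a small sketch; `uncompressed_bytes` is a made-up helper for illustration, not a libvips function):

```python
# libvips compares the disc threshold against the *decoded* image size,
# width * height * bands * sizeof(element), not the compressed file size.
def uncompressed_bytes(width, height, bands, element_size=1):
    """Size of the decoded image in bytes; element_size is 1 for 8-bit images."""
    return width * height * bands * element_size

# a 10000 x 10000 8-bit RGB PNG decodes to 300,000,000 bytes (~286 MiB),
# so it would be over a "100mb" threshold even if the file itself is tiny
print(uncompressed_bytes(10000, 10000, 3))
```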
There's a page in the docs covering libvips image opening which gives some background.
I am trying to write large PNGs with `.composite` (it's on a Kubernetes node with 8 GB RAM).

My question is about expected behaviour and the right pattern to do this: should it be possible to composite any number of "small" images streamed (i.e. loaded from S3 into memory one image at a time) onto a large output image and save it to file? Assume these "small" images are much smaller in memory than 8 GB.

I had assumed that the write would sequentially write out the PNG without holding the streamed images in memory, but I must be mistaken? It is confusing to me right now that my first test, which writes a single PNG, works, but my true test of writing out different images does not.