Closed. jmsmkn closed this issue 1 month ago.
We could do something similar to what we do with CIRRUS for very large MHA files:
Depending on the builder and its function we might just copy over the raw data directly, after cleaning up the headers. It depends on whether we do any data manipulation.
The only change (IIRC) is that any uncompressed data gets compressed. I think Paul looked at this before and it turned out SimpleITK was not doing this efficiently, but I cannot find the details.
Actually doing the reverse operation of that CIRRUS method seems not too bad to maintain.
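The nice property of MetaImage here is that an uncompressed LOCAL `.mha` is just an ASCII header followed by the raw pixel bytes, so the "clean headers, copy raw data" route can stream the data through in bounded chunks. A rough stdlib-only sketch; the function name, the `drop_keys` argument, and the assumption of `ElementDataFile = LOCAL` are mine for illustration, not what the builder actually does:

```python
import shutil

def clean_copy_mha(src, dst, drop_keys=("Comment",)):
    """Copy an uncompressed LOCAL .mha, dropping unwanted header keys and
    streaming the raw pixel data through in bounded chunks, so peak extra
    memory stays O(chunk size) rather than O(image size)."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # The MetaImage header is ASCII "Key = Value" lines; the raw pixel
        # bytes start right after the "ElementDataFile" line.
        for line in iter(fin.readline, b""):
            key = line.split(b"=", 1)[0].strip().decode("ascii")
            if key not in drop_keys:
                fout.write(line)
            if key == "ElementDataFile":
                break
        shutil.copyfileobj(fin, fout, length=1 << 20)  # 1 MiB chunks
```

Validation of the header values (dimensions, element type, data size) would still be needed on top of this, but it never materialises the pixel buffer in memory.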
A quick experiment using /usr/bin/time --verbose to keep track of RSS spikes tells me that just using SimpleITK to read a NIfTI file peaks at roughly double the memory you'd expect.
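For reference, the same high-water mark that /usr/bin/time --verbose reports as "Maximum resident set size" can also be watched from inside the process with the stdlib. A small sketch (the numbers are illustrative, not from the experiment above; note ru_maxrss units differ between Linux and macOS):

```python
import resource
import sys

def max_rss_bytes():
    """Peak resident set size of this process so far.
    ru_maxrss is in KiB on Linux but in bytes on macOS."""
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return rss if sys.platform == "darwin" else rss * 1024

before = max_rss_bytes()
buf = b"x" * (200 * 1024 * 1024)  # stand-in for loading a ~200 MiB image
after = max_rss_bytes()
print(f"peak RSS grew by ~{(after - before) / 2**20:.0f} MiB")
del buf
```

Sprinkling such measurements around the read and write calls would show which step is responsible for the spike.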
Re-reading the created .mha with the same script shows a memory spike around the same size as the uncompressed image.
So likely the ITK ImageIO used is not efficient, or maybe it cannot be efficient.
NIfTI we shouldn't care about here. Users can be told to go away and do the conversion themselves offline (using this).
This is only about our MHA/MHD -> validated compressed MHA route.
Having a spike of the size of the uncompressed image is okay - we need to check if it can be successfully loaded, but at the moment we max out at only being able to import 4GB MHA files on 8GB instances.
Relevant support request for Fargate swap support https://github.com/aws/containers-roadmap/issues/2061
> Having a spike of the size of the uncompressed image is okay - we need to check if it can be successfully loaded, but at the moment we max out at only being able to import 4GB MHA files on 8GB instances.
That makes sense. We need to load it at least once! Is the source of the double memory peak clear?
Doing MHA -> MHA seems to use only the uncompressed-image size as max RSS. With an uncompressed 5.8 GiB image in an MHA -> MHA conversion I see ~5.8 GiB max RSS locally without compression and ~6.9 GiB with compression turned on.
Extrapolating this down to 4GB MHA files, we would use ~6GB for writing. Would that sufficiently explain failure on an 8GB instance, once (OS) system RAM usage is taken into account?
I made this issue as the worker choked on a 4.8GB uncompressed MHA using an 8GB instance. OS stuff we do not need to worry about as this is running on Fargate. The celery worker isn't heavy enough to explain the difference I think - it idles at 5% of memory (400MB).
Strange. Taking the expected ~19% compression overhead on a 4.8GB image plus 400MB would still only push things to ~6.1GB, well short of the 8GB. Maybe it is a difficult file to compress and the overhead is somehow different.
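A back-of-the-envelope check of those numbers (the ~19% comes from the 6.9 vs 5.8 GiB measurement above):

```python
compress_overhead = 6.9 / 5.8  # ~1.19x, from the MHA -> MHA experiment above
image_gb = 4.8                 # uncompressed file that choked the worker
worker_gb = 0.4                # idle celery worker (~5% of 8GB)

expected_peak_gb = image_gb * compress_overhead + worker_gb
print(f"expected peak ~= {expected_peak_gb:.1f} GB on an 8GB instance")
```

So unless the real overhead is much larger than 19%, the arithmetic alone does not explain hitting the 8GB ceiling.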
Then again, why else would it choke? So the compression is likely the culprit here. Chunking it could help.
Problem is within GC, continued in https://github.com/DIAGNijmegen/rse-grand-challenge-admin/issues/310
Is there a way to stream the re-writing of the image data in MHA conversion?
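At the raw-byte level it should be, since MetaIO's CompressedData uses zlib (IIRC) and zlib can compress incrementally, so the pixel buffer never has to exist in memory twice. A sketch of the chunked compression step only; a real implementation would also have to patch CompressedDataSize back into the header, since MetaIO wants it there and it is only known once compression finishes:

```python
import zlib

def stream_compress(fin, fout, chunk_size=1 << 20):
    """zlib-compress fin into fout in fixed-size chunks; peak extra memory
    is O(chunk_size) instead of holding the whole pixel buffer twice."""
    comp = zlib.compressobj()
    total = 0
    while True:
        chunk = fin.read(chunk_size)
        if not chunk:
            break
        total += fout.write(comp.compress(chunk))
    total += fout.write(comp.flush())
    return total  # compressed size, for MetaIO's CompressedDataSize field
```

Whether this can be wired into the existing SimpleITK-based route, or would have to replace it for the MHA -> MHA path, is a separate question.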