Closed lassoan closed 7 months ago
We should avoid storing the segmentation result at full resolution as float for each segment. For example, storing 512x512x321 float64 values for 25 segments would consume about 16GB. If we stored the values as int8 instead of float64, it would consume only about 2GB, which should easily fit into RAM and may make the processing 1-2 orders of magnitude faster (5 seconds instead of 5 minutes).
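The memory figures above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the shape and segment count come from the comment, while the helper name `volume_bytes` is made up for illustration, and the float32 row is added purely for comparison.

```python
# Rough memory estimate for storing one full-resolution volume per segment.
# Bytes per value for the dtypes being compared.
BYTES_PER_VALUE = {"float64": 8, "float32": 4, "int8": 1}

def volume_bytes(shape, n_segments, dtype):
    """Bytes needed to hold n_segments volumes of the given shape and dtype."""
    n_voxels = 1
    for dim in shape:
        n_voxels *= dim
    return n_voxels * n_segments * BYTES_PER_VALUE[dtype]

shape = (512, 512, 321)  # full-resolution image size from the comment
for dtype in ("float64", "float32", "int8"):
    gb = volume_bytes(shape, 25, dtype) / 1e9
    print(f"{dtype}: {gb:.1f} GB")
# float64: 16.8 GB
# float32: 8.4 GB
# int8: 2.1 GB
```

Temporary copies made during processing come on top of these numbers, so actual peak usage can be higher.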
It could also help not to process all the segments at once, as that can also temporarily increase memory usage beyond the physically available RAM. Alternatively, we could split larger images into, say, 5 vertical sections and process them one by one, which would also reduce memory usage by a factor of 5.
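The slab idea could look roughly like the sketch below. It is a toy NumPy illustration, not the extension's code: `segment_in_slabs` and `segment_fn` are hypothetical names, the last axis is assumed to be the "vertical" one, and a real model would probably need overlapping slabs so that each section sees enough context at its borders.

```python
import numpy as np

def segment_in_slabs(image, segment_fn, n_slabs=5):
    """Run segment_fn on consecutive slabs along the last axis.

    Only one slab's intermediate data is alive at a time; the result is
    collected in a single uint8 label volume.
    """
    labels = np.empty(image.shape, dtype=np.uint8)
    # Slab boundaries along the last axis, e.g. 0, 64, 128, 192, 256, 321
    bounds = np.linspace(0, image.shape[-1], n_slabs + 1, dtype=int)
    for start, stop in zip(bounds[:-1], bounds[1:]):
        labels[..., start:stop] = segment_fn(image[..., start:stop])
    return labels

# Toy stand-in for the real model: a simple threshold (assumption for the demo)
def toy_segment(slab):
    return (slab > 50).astype(np.uint8)

image = np.arange(4 * 4 * 11, dtype=np.float32).reshape(4, 4, 11)
labels = segment_in_slabs(image, toy_segment, n_slabs=5)
```

With a purely voxel-wise function like the toy threshold, the slab-wise result matches processing the whole volume at once. A network with a receptive field will not behave this way at slab borders unless the slabs overlap.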
Maybe check how TotalSegmentator can perform these postprocessing steps so much more quickly.
I've tried the low-res models on a computer with 32GB RAM and a GPU, and the total time needed was less than half a minute, so this is pretty amazing. But it used 32GB RAM. We really need to bring down the memory usage so that, at a minimum, it can run fast on a computer with 16GB RAM. If we want most users to be able to run it, then we cannot require more than 8GB RAM.
On the same laptop, TotalSegmentator v2 execution takes 133 seconds in total, for all 5 groups of segments. It uses about 70% of CPU during the entire process. Auto3DSeg's CPU usage does not go above 25-30%, during preds it is about 5% (and during preds virtual memory reserved for the process goes up to 17GB).
Many thanks for taking the time to test these models, @lassoan.
Here I've changed the order in which we invert the pre-processing transforms and do the argmax function: https://github.com/lassoan/SlicerMONAIAuto3DSeg/pull/21/files
Please test this and let us know.
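To illustrate why the order matters, here is a toy NumPy sketch. It is not the code from the pull request: `upsample_nearest` is a hypothetical stand-in for inverting the pre-processing transforms, and the array sizes are shrunk so that it runs quickly. The point is that taking the argmax first means only a single uint8 label volume has to be resampled to full resolution, instead of 25 float channels.

```python
import numpy as np

def upsample_nearest(volume, factor):
    """Nearest-neighbor upsampling of the last three axes by an integer factor.

    Stand-in for inverting the pre-processing (resampling) transforms.
    """
    for axis in (-3, -2, -1):
        volume = np.repeat(volume, factor, axis=axis)
    return volume

# Toy stand-in for the network output: 25 class channels on a small grid
rng = np.random.default_rng(0)
scores = rng.random((25, 16, 13, 13), dtype=np.float32)

# Invert first, then argmax: the full-resolution intermediate holds 25 float channels
scores_full = upsample_nearest(scores, 4)
labels_slow = np.argmax(scores_full, axis=0).astype(np.uint8)

# Argmax first, then invert: only one uint8 channel is resampled
labels_low = np.argmax(scores, axis=0).astype(np.uint8)
labels_fast = upsample_nearest(labels_low, 4)

print(scores_full.nbytes // labels_fast.nbytes)  # ratio of intermediate sizes
```

With nearest-neighbor resampling the two orders give identical labels. With interpolating resampling the results can differ slightly at segment boundaries, so that trade-off is worth checking on real data.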
Wow! Memory usage is down from 17GB to about 3GB, total time down from 582 seconds to 87 seconds.
Process data: 2.4 -> 2.3 seconds
Inference: 126.1 -> 63.3 seconds
Logits: 2.9 -> 1.3 seconds
Preds: 339.5 -> 7.0 seconds
NIFTI save: 6.6 -> 1.9 seconds
This is ready to be released!
That is awesome. Thank you both for the hard work!
On Thu, 22 Feb 2024 at 20:30, Andras Lasso < @.***> wrote:
Closed #20 https://github.com/lassoan/SlicerMONAIAuto3DSeg/issues/20 as completed.
Tested Low-res abdominal organs model on a laptop (Intel Core i7, 16GB RAM, no GPU).
Good news: inference is fast (126 seconds). Bad news: total segmentation time is still long (582 seconds).
Details:
`Preds` time is painfully long. Especially slow was `preds torch.Size([1, 25, 160, 133, 134])`, but `preds inverted torch.Size([25, 512, 512, 321])` was also quite slow. Maybe because they both used up all the 16GB RAM and CPU usage was below 10%.
`Convert to segmentation` (>=0.5 or torch.argmax) also seems to take longer than it should.