lassoan / SlicerMONAIAuto3DSeg

Extension for 3D Slicer for running MONAI Auto3DSeg models
MIT License

Low-resolution models still too slow #20

Closed lassoan closed 7 months ago

lassoan commented 7 months ago

Tested Low-res abdominal organs model on a laptop (Intel Core i7, 16GB RAM, no GPU).

Good news: inference is fast (126 seconds). Bad news: total segmentation is still slow (582 seconds).

Details:

Preds time is painfully long. Especially slow was preds torch.Size([1, 25, 160, 133, 134]), but preds inverted torch.Size([25, 512, 512, 321]) was also quite slow. Maybe this is because both steps used up all 16GB of RAM while CPU usage stayed below 10%.

Convert to segmentation (>=0.5 or torch.argmax) also seems to take longer than it should.

lassoan commented 7 months ago

We should avoid storing the segmentation result at full resolution as float for each segment. For example, storing 512x512x321 float64 values for 25 segments would consume about 16GB. If we stored the values as int8 instead of float64, it would consume only about 2GB, which should easily fit into RAM and may make the processing 1-2 orders of magnitude faster (5 seconds instead of 5 minutes).
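The memory figures above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the shape and segment count come from the comment, while the helper name `footprint_gb` and the single-labelmap comparison are illustrative additions, not part of the extension.

```python
import math

# Full-resolution output described above: 25 segments on a 512x512x321 grid.
shape = (512, 512, 321)
num_segments = 25
voxels = math.prod(shape)

def footprint_gb(bytes_per_value, channels):
    """Size in GB (10**9 bytes) of `channels` volumes of `shape`."""
    return voxels * channels * bytes_per_value / 1e9

per_segment_float64 = footprint_gb(8, num_segments)  # one float64 channel per segment
per_segment_int8 = footprint_gb(1, num_segments)     # one int8 channel per segment
single_labelmap = footprint_gb(1, 1)                 # one uint8 label per voxel (assumption)

print(f"float64 per segment: {per_segment_float64:.1f} GB")
print(f"int8 per segment:    {per_segment_int8:.1f} GB")
print(f"single uint8 labelmap: {single_labelmap:.3f} GB")
```

If the segments do not overlap, a single label map with one small integer per voxel would be smaller still, since 25 labels plus background fit in 8 bits.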

It could also help to not process all the segments at once, as that can also temporarily increase memory usage beyond the physically available RAM. Or we could split larger images into, let's say, 5 vertical sections and process them one by one, as that would also reduce memory usage by a factor of 5.
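A minimal sketch of the "process it in sections" idea, using NumPy rather than the extension's actual PyTorch/MONAI code. The function name `argmax_in_slabs` and the choice of the last axis as the "vertical" direction are assumptions made for illustration. Note that this only bounds the size of the temporaries created during label conversion; the larger saving would come from producing the per-class scores slab by slab as well.

```python
import numpy as np

def argmax_in_slabs(scores, num_slabs=5):
    """Convert per-class scores of shape (C, X, Y, Z) to a uint8 label map,
    processing one slab of the last axis at a time."""
    if scores.shape[0] > 256:
        raise ValueError("uint8 labels can hold at most 256 classes")
    labels = np.empty(scores.shape[1:], dtype=np.uint8)
    # Split the index range of the last axis into roughly equal slabs.
    for idx in np.array_split(np.arange(scores.shape[-1]), num_slabs):
        if idx.size == 0:
            continue  # more slabs requested than slices available
        sl = slice(int(idx[0]), int(idx[-1]) + 1)
        # The argmax temporary covers only this slab, not the whole volume.
        labels[..., sl] = np.argmax(scores[..., sl], axis=0).astype(np.uint8)
    return labels

# Small synthetic example (random scores, not real model output).
rng = np.random.default_rng(0)
demo_scores = rng.random((4, 6, 5, 11)).astype(np.float32)
demo_labels = argmax_in_slabs(demo_scores, num_slabs=5)
print(demo_labels.shape, demo_labels.dtype)
```

The slab-wise result is identical to a single `np.argmax` over the whole array, because argmax over the class axis is independent per voxel.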

Maybe check how TotalSegmentator can perform these postprocessing steps so much more quickly.

lassoan commented 7 months ago

I've tried the low-res models on a computer with 32GB RAM and a GPU and the total time needed was less than half a minute, so this is pretty amazing. But it used 32GB RAM. We really need to bring down the memory usage so that, at the minimum, it can run fast on a computer with 16GB RAM, but if we want most users to be able to run it then we cannot require more than 8GB RAM.

On the same laptop, TotalSegmentator v2 execution takes 133 seconds in total, for all 5 groups of segments. It uses about 70% of CPU during the entire process. Auto3DSeg's CPU usage does not go above 25-30%, during preds it is about 5% (and during preds virtual memory reserved for the process goes up to 17GB).

diazandr3s commented 7 months ago

Many thanks for taking the time to test these models, @lassoan.

Here I've changed the order in which we invert the pre-processing transforms and do the argmax function: https://github.com/lassoan/SlicerMONAIAuto3DSeg/pull/21/files

Please test this and let us know.
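To illustrate why the order of these two steps matters for memory, here is a small NumPy sketch. It is not the code from the pull request: `upsample_nearest` is a hypothetical stand-in for inverting the resampling transform, and the array sizes are toy values. It assumes the change amounts to taking the argmax on the low-resolution grid first and inverting only the resulting label map.

```python
import numpy as np

def upsample_nearest(volume, factor):
    """Stand-in for inverting a downsampling transform: repeat voxels
    along the last three (spatial) axes."""
    for axis in (-3, -2, -1):
        volume = np.repeat(volume, factor, axis=axis)
    return volume

rng = np.random.default_rng(0)
scores = rng.random((25, 8, 8, 6)).astype(np.float32)  # (classes, X, Y, Z), toy size
factor = 2

# Order A: invert the per-class scores first, then take the argmax.
scores_full = upsample_nearest(scores, factor)          # 25 float channels at full size
labels_a = np.argmax(scores_full, axis=0).astype(np.uint8)

# Order B: take the argmax at low resolution, then invert the label map.
labels_low = np.argmax(scores, axis=0).astype(np.uint8)
labels_b = upsample_nearest(labels_low, factor)         # one uint8 channel at full size

print("full-size array in order A:", scores_full.nbytes, "bytes")
print("full-size array in order B:", labels_b.nbytes, "bytes")
```

With nearest-neighbour upsampling the two orders give identical labels. With an interpolating inverse transform the labels may differ slightly at segment boundaries, which is the trade-off for the much smaller full-resolution array.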

lassoan commented 7 months ago

Wow! Memory usage is down from 17GB to about 3GB, total time down from 582 seconds to 87 seconds.

Process data: 2.4 -> 2.3 seconds
Inference: 126.1 -> 63.3 seconds
Logits: 2.9 -> 1.3 seconds
Preds: 339.5 -> 7.0 seconds
NIFTI save: 6.6 -> 1.9 seconds

This is ready to be released!

rbumm commented 7 months ago

That is awesome. Thank you both for the hard work!

On Thu, Feb 22, 2024 at 20:30, Andras Lasso < @.***> wrote:

Closed #20 https://github.com/lassoan/SlicerMONAIAuto3DSeg/issues/20 as completed.
