marcobarilari opened 9 months ago
I just tried it on my Mac too, and I can confirm that a RIM file with an unzipped size of 3.5 GB (875 MB gzipped) makes `LN2_LAYERS` use at least 123 GB of RAM.
In the code of `LN2_LAYERS` there are 36 allocations of the 3D matrix (instances of `nifti_image*`), so I think the numbers check out. To solve this, I think there are three options:
1.) Maybe we can free up the space of interim arrays when they are no longer needed, before more arrays are allocated? E.g. maybe with something like `delete [] nifti_image` (see the sketch after this list).
2.) Maybe we can reuse some of the interim arrays, which might make the code harder to read :-/
3.) just buy bigger computers ;-)
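To make option 1.) concrete, here is an untested sketch of what I mean (the `nii_interim` name and the copy step are just placeholders, not actual `LN2_LAYERS` code; it only assumes niftilib's `nifti_copy_nim_info` and `nifti_image_free`):

```cpp
#include <cstdlib>
#include "nifti1_io.h"  // niftilib header; the include path may differ in the LAYNII tree

// Hypothetical illustration of option 1: release an interim 3D array as soon
// as it is no longer needed, before the next large allocation happens.
void process(nifti_image* nii_rim) {
    // Interim copy used for some intermediate computation (placeholder).
    nifti_image* nii_interim = nifti_copy_nim_info(nii_rim);
    nii_interim->data = calloc(nii_interim->nvox, nii_interim->nbyper);

    // ... compute with nii_interim ...

    // Free both the header struct and its data buffer before allocating the
    // next interim array, instead of keeping all of them alive at once.
    nifti_image_free(nii_interim);
    nii_interim = NULL;
}
```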
Any suggestions, @ofgulban?
Hello @marcobarilari ,
Thanks for opening this issue. I have been looking for something like this to come up to justify further memory optimization in `LN2_LAYERS`. @layerfMRI I already have something in mind to deflate the memory usage, exploiting the sparsity of the gray matter voxels in the whole brain.
With regard to swapping pointers, I think that sounds correct (see the sketch at the end of this comment). @marcobarilari can you send me your rim file (if you wish, via email), so that I can check the new optimizations directly on your case? I assume that you do not depend on this optimization and already have a working solution for yourself, right? It might take a couple of weeks for me to open up some time for this.
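To illustrate what swapping pointers could look like (a hypothetical, generic sketch, not existing LAYNII code): instead of allocating a fresh full-size buffer for every intermediate result, two buffers can be reused by swapping which one is "current".

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch of reusing two buffers by swapping pointers instead of
// allocating a new full-size array for every intermediate step.
int main() {
    const std::size_t nvox = 64 * 64 * 64;         // toy grid size
    std::vector<float> buf_a(nvox, 0.0f), buf_b(nvox, 0.0f);
    std::vector<float>* current = &buf_a;          // result of the previous step
    std::vector<float>* next = &buf_b;             // scratch space for the next step

    for (int step = 0; step < 5; ++step) {
        for (std::size_t i = 0; i < nvox; ++i) {
            (*next)[i] = (*current)[i] + 1.0f;     // stand-in for the real computation
        }
        std::swap(current, next);                  // reuse buffers, no new allocation
    }
    return 0;
}
```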
Hi,
Thank you very much for your answer. I am uploading the file to the cloud and will send you the link.
LMK if you need more info etc.
Marco
I have added a `LN2_RIM_BORDERIZE` program to speed up the computation time of `LN2_LAYERS` (with 185f5c6dfa7fcb456a3af8e3c2194bca52b8dd43). This is possible because there is already an optimization step in place that goes faster if the rim file consists of "hollowed out" non-gray-matter voxel labels (attached below). However, note that this does not result in RAM optimization for now.
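For readers unfamiliar with the idea, a toy sketch of such a "hollowing out" step could look like the following. This is only an illustration, not the `LN2_RIM_BORDERIZE` source, and it assumes the usual LAYNII rim labels (1 = CSF side, 2 = WM side, 3 = gray matter) and a 6-neighborhood:

```cpp
#include <cstdint>
#include <vector>

// Toy "borderize" illustration: zero out non-gray-matter labels unless they
// touch a gray matter voxel along x, y, or z.
std::vector<int16_t> borderize(const std::vector<int16_t>& rim,
                               int nx, int ny, int nz) {
    std::vector<int16_t> out(rim.size(), 0);
    auto idx = [&](int x, int y, int z) { return x + nx * (y + ny * z); };
    for (int z = 0; z < nz; ++z) {
        for (int y = 0; y < ny; ++y) {
            for (int x = 0; x < nx; ++x) {
                const int16_t v = rim[idx(x, y, z)];
                if (v == 3) { out[idx(x, y, z)] = 3; continue; }  // keep all gray matter
                if (v == 0) { continue; }
                bool touches_gm = false;  // check the 6 face neighbors
                if (x > 0      && rim[idx(x - 1, y, z)] == 3) touches_gm = true;
                if (x < nx - 1 && rim[idx(x + 1, y, z)] == 3) touches_gm = true;
                if (y > 0      && rim[idx(x, y - 1, z)] == 3) touches_gm = true;
                if (y < ny - 1 && rim[idx(x, y + 1, z)] == 3) touches_gm = true;
                if (z > 0      && rim[idx(x, y, z - 1)] == 3) touches_gm = true;
                if (z < nz - 1 && rim[idx(x, y, z + 1)] == 3) touches_gm = true;
                if (touches_gm) { out[idx(x, y, z)] = v; }  // keep only the border shell
            }
        }
    }
    return out;
}
```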
[Update] I have started working on writing a new program, `LN3_LAYERS` (in the `devel` branch). I have implemented a rather sophisticated RAM optimization to deflate the requirements. Currently, equidistant layerification computations are implemented and working. A 100 micron isotropic whole brain dataset (BigBrain) takes around 8 minutes to compute (~25% faster than `LN2_LAYERS` on my laptop) and consumes ~15 GB of RAM as opposed to hundreds. This is a massive improvement and I am happy with it, though I might be able to decrease it a bit more. I will proceed with implementing equivolume computations next.
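As a side note for readers, equidistant layering itself boils down to a simple normalization per gray matter voxel. A minimal sketch (not the `LN3_LAYERS` implementation; the distance values here are made up):

```cpp
#include <cmath>
#include <cstdio>

// Toy equidistant binning: normalize the distance to the inner (WM) border by
// the local cortical thickness, then bin the depth into nr_layers layers.
int main() {
    const int nr_layers = 7;
    const float dist_to_wm = 0.6f;   // example distance in mm (made up)
    const float dist_to_csf = 1.2f;  // example distance in mm (made up)

    const float depth = dist_to_wm / (dist_to_wm + dist_to_csf);      // 0 at WM, 1 at CSF
    int layer = static_cast<int>(std::floor(depth * nr_layers)) + 1;  // layers 1..nr_layers
    if (layer > nr_layers) layer = nr_layers;                         // guard depth == 1

    std::printf("depth = %.3f -> layer %d of %d\n", depth, layer, nr_layers);
    return 0;
}
```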
Wow, that's magic. I tested it (without include borders, without equibins, without equivol), and for me the improvement is even higher.
For a 0.3 mm whole brain MP2RAGE scan it is an almost three-fold improvement in time, because of the reduced use of swap. With less RAM available (I have 64 GB on my Mac M1), I anticipate a speed improvement of more than an order of magnitude. How did you achieve this?
Cool that you already tried :D. Many features / outputs are not implemented yet, so the requirements will probably increase a bit. However, the core improvements are:
- Using the `LN2_RIM_BORDERIZE` algorithm to figure out and only allocate memory for the gray matter voxels and their immediate neighbors (borders). It turns out that often only around 20% of the whole brain voxels are cortical gray matter (in a tight whole brain coverage), so allocating memory only for those deflates the RAM by ~80%. This number is now printed as a "sparsity" measurement.
- The speed optimization already present in `LN2_LAYERS` when a borderized rim file is given (using `LN2_RIM_BORDERIZE` on the input rim).
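As a toy illustration of the first point (not the actual `LN3_LAYERS` code): the dense grid is scanned once to build a compact index, and every later working array is sized by the number of needed voxels instead of the full grid.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy sparsity illustration: map dense voxel indices to a compact index, then
// allocate working arrays only for the gray matter / border voxels.
int main() {
    const int nx = 10, ny = 10, nz = 10;
    std::vector<int16_t> rim(static_cast<std::size_t>(nx) * ny * nz, 0);
    rim[123] = 3; rim[124] = 3; rim[125] = 2;   // a few labeled voxels for the example

    std::vector<int32_t> dense_to_sparse(rim.size(), -1);  // -1 = voxel not needed
    int32_t nr_needed = 0;
    for (std::size_t i = 0; i < rim.size(); ++i) {
        if (rim[i] != 0) { dense_to_sparse[i] = nr_needed++; }
    }

    // All later working arrays (distances, layer ids, ...) get nr_needed
    // elements instead of nx * ny * nz, which is where the RAM saving comes from.
    std::vector<float> dist_to_wm(nr_needed, 0.0f);
    std::vector<int16_t> layer_id(nr_needed, 0);
    return 0;
}
```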
Hey there! I missed all the updates on this issue somehow. I apologize for that.
Thank you very much @ofgulban for working on this; it is amazing that you managed to make LAYNII even more efficient.
I tried to test `LN3_LAYERS` myself, but I did not manage to compile the new function. What I did was:
1.) switch to the `devel` branch
2.) run `make all` from within the repo folder

However, this does not compile `LN3_LAYERS` (i.e. I don't see it in the main folder together with the other functions).
Any hint? I am a total noob in C++ and similar languages.
OK, never mind, I figured it out. I guess `make all` did not work because `LN3_LAYERS` is not part of the list of programs to compile. When running `make ./src/LN3LAYERS`, it compiled correctly.
Anyway, it is great! It took only 2 minutes for a whole brain rim at 0.25 mm with no disk hogging, compared to 10 minutes and lots of writing/reading, which would fail with a full HD.
In the output, though, I am noticing small differences (see the screenshots: 1) is `LN3_LAYERS` and 2) is `LN2_LAYERS`). Do you know why, and whether it could be of concern?
The command run was `LN3_LAYERS -rim rim123_space-ANAT.nii.gz -nr_layers 7`.
Thanks again for taking the time for this enhancement :)
Hi @marcobarilari ,
These differences are due to the way `LN3_LAYERS` detects "border neighbors" automatically (this is a part of the necessary RAM optimization). If you can post a screenshot of your rim file, I can more confidently tell you whether this is the source of the differences or not. I would not be concerned about this, but again, this is very much a "work in progress" program right now.
Also, yes, indeed `LN3_LAYERS` is not part of the main compilation right now. You need to compile it with the `make LN3_LAYERS` command.
Hi! Thank you for your reply.
Here below are new screenshots, with the rim as well. LMK if you need more info. The order is `LN2_LAYERS`, `LN3_LAYERS`, and rim.
Hi there,
I want to report that during "layerification" with `LN2_LAYERS` I had a problem with too much RAM being used, with the consequence of the system killing the running process.

To give you more info:
- `rim` derived by `recon-all` (freesurfer) on a 0.75 mm iso MP2RAGE, then upsampled to 0.25 mm iso (weight: 875 MB gzipped)
- Linux crunch machine (aka the "labMonster") with 64 GB of RAM
- LAYNII version: 2.3.0
My workaround was to increase the swap memory to 200 GB, and it is working, with ~40 min of CPU time and ~120 GB of memory space (RAM + swap) occupied. As far as I understand, on Macs and Windows this swapping happens by default, so it is possible that the problem might occur on Linux machines, or on Macs/Windows with an almost full HD.

Do you have any suggestions? Does it make sense?
Marco