PreibischLab / BigStitcher

ImgLib2/BDV implementation of Stitching for large datasets
GNU General Public License v2.0

RAM usage when fusing HDF5 in "cached" mode #98

Closed dpshepherd closed 3 years ago

dpshepherd commented 3 years ago

Hi all,

We have a multi-terabyte BDV H5/XML dataset consisting of 22 tiles, each with four channels. When fusing, we select "Save as new XML project (HDF5)" and the "cached" option. The fusion GUI panel predicts a RAM usage of ~569710 for cached/H5. The fused image size is 29102 x 19322 x 335 pixels at (1,1,1) resolution. We are not writing any downsampled levels to this HDF5, as we will convert it to N5 afterward with downsampling.
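For a sense of scale, the fused dimensions quoted above can be turned into an uncompressed size. This is a minimal sketch, not something taken from BigStitcher: the class and method names are made up for illustration, and the 2 bytes per voxel (16-bit data) is an assumption, since the post does not state the pixel type.

```java
// Rough size estimate for the fused volume described in the post.
// ASSUMPTION: 16-bit output (2 bytes per voxel); the pixel type is not stated in the thread.
class FusedSizeEstimate {

    /** Number of voxels in a volume, computed in 64-bit arithmetic to avoid int overflow. */
    static long voxelCount(long width, long height, long depth) {
        return width * height * depth;
    }

    /** Uncompressed size in bytes for a given channel count and bytes per voxel. */
    static long uncompressedBytes(long voxels, int channels, int bytesPerVoxel) {
        return voxels * channels * bytesPerVoxel;
    }

    public static void main(String[] args) {
        long voxels = voxelCount(29102, 19322, 335);
        int bytesPerVoxel = 2; // assumed, not from the thread
        double gib = 1024.0 * 1024.0 * 1024.0;

        System.out.printf("voxels per channel: %d%n", voxels);
        System.out.printf("one channel:   %.1f GiB%n",
                uncompressedBytes(voxels, 1, bytesPerVoxel) / gib);
        System.out.printf("four channels: %.1f GiB%n",
                uncompressedBytes(voxels, 4, bytesPerVoxel) / gib);
    }
}
```

Under that assumption a single fused channel is roughly 350 GiB uncompressed, so if all four channels are fused the output alone would exceed the server's ~1 TB of RAM and cannot be held in memory at once.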

What we consistently observe for large fusions like this one (cached, new HDF5) is that RAM use grows over time until it consumes all of the RAM on our server (~1 TB) and triggers significant use of the swap file. That is more than FIJI is allocated (900 GB).

Is there a way to keep the fusion from consuming all of the server's RAM? We have other users in the group who need to use the server, so we are OK with trading fusion speed for lower RAM usage. Should we run a different FIJI instance with a much lower RAM limit?

The server is running Linux Mint 19.3 and the most recent BigStitcher and FIJI.

Thanks!

pawlowska commented 3 years ago

Thanks for raising this issue. I was also wondering about RAM during fusion. I only have experience with data processing PCs in the 80-100 GB RAM range. Fusion to HDF5 is RAM-hungry, and the RAM is not freed after fusion is done; on the PC, I end up restarting Fiji after each fusion. This doesn't happen for HDF5 resaving of raw data.

StephanPreibisch commented 3 years ago

Hi, Java (Fiji) will use as much RAM as it is allowed to use, especially if the dataset it works with is bigger than the available RAM (it will also hold multiple copies of the data in RAM: raw, fused, temp). You can limit the amount of RAM it uses to whatever you want (Fiji > Edit > Options > Memory & Threads, e.g. set it to 256 GB). Caching means it will use all the RAM it has available and drop the least recently used data once the RAM is full.

Does that work?
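The "drop least recently used data once the RAM is full" behaviour can be illustrated with a generic LRU cache. This is only a sketch of the eviction policy using the standard `LinkedHashMap`; it is not BigStitcher's or ImgLib2's actual cache, and the `LruCache` class, its entry-count limit, and the "block" values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Generic least-recently-used cache built on LinkedHashMap.
// Illustration only: NOT the cache implementation used by BigStitcher.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; evict the least recently used entry when over capacity
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<Integer, String> cache = new LruCache<>(2);
        cache.put(1, "block 1");
        cache.put(2, "block 2");
        cache.get(1);            // touch block 1, so block 2 becomes the eldest
        cache.put(3, "block 3"); // over capacity: block 2 is evicted
        System.out.println(cache.keySet()); // prints [1, 3]
    }
}
```

A real image cache would bound memory by bytes rather than by entry count, but the trade-off is the same: a smaller limit means more evictions and more re-reading from disk, which is the speed-for-RAM exchange asked about in the original post.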

dpshepherd commented 3 years ago

> Does that work?

No, that is the source of my question. When fusing with HDF5 and cached, it does not stay within the RAM allocated to FIJI through the options. If we set this to 256 GB, it continues using RAM up to 1 TB.

dpshepherd commented 3 years ago

We eventually removed FIJI and reinstalled BigStitcher. BigStitcher still does not respect the memory limit set in Fiji, but now it does stay within 10% of the allocated memory when fusing H5/cached. This is acceptable for our use, so I'll close this.

StephanPreibisch commented 3 years ago

Hi @dpshepherd, BigStitcher really cannot use more RAM than Fiji has been allocated; the limit is enforced by the Java instance and cannot be circumvented from within that instance. Something else must be wrong ...
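One way to narrow this down is to compare the heap limit the JVM reports with the memory the operating system attributes to the process. The heap limit bounds Java objects only; memory obtained outside the heap (for example by native libraries) and the operating system's file cache are accounted for separately, so system monitors can show more than the configured limit. Whether that explains the observation in this thread is not established here. The sketch below uses only standard Java calls plus the Linux `/proc/self/status` file; the class name is hypothetical, and it would have to run inside the same JVM as Fiji (for example from a script) to report on that process.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Compares the JVM heap limit with the resident memory of the whole process.
// Illustration only; not part of Fiji or BigStitcher.
class HeapVsProcessMemory {

    /** Maximum heap the JVM will attempt to use, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap currently occupied by Java objects (including garbage not yet collected), in bytes. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Resident set size of this process in kB (Linux only), or -1 if unavailable. */
    static long residentSetKb() {
        Path status = Paths.get("/proc/self/status");
        if (!Files.isReadable(status)) {
            return -1;
        }
        try {
            for (String line : Files.readAllLines(status)) {
                if (line.startsWith("VmRSS:")) {
                    // Line looks like: "VmRSS:     123456 kB"
                    return Long.parseLong(line.trim().split("\\s+")[1]);
                }
            }
        } catch (IOException | NumberFormatException e) {
            // Fall through and report "unavailable"
        }
        return -1;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println("max heap (MiB):  " + maxHeapBytes() / mib);
        System.out.println("used heap (MiB): " + usedHeapBytes() / mib);
        long rssKb = residentSetKb();
        System.out.println("process RSS (MiB): " + (rssKb < 0 ? "unavailable" : String.valueOf(rssKb / 1024)));
    }
}
```

If the reported maximum heap does not match the value set under Memory & Threads, the setting is not reaching the JVM; if it matches but the resident size is far larger, the extra memory is being used outside the Java heap.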

StephanPreibisch commented 3 years ago

Which Java version do you use? Maybe that is a problem?

dpshepherd commented 3 years ago

(base) dps@qi2labserver:~$ java -version
openjdk version "11.0.11" 2021-04-20
OpenJDK Runtime Environment (build 11.0.11+9-Ubuntu-0ubuntu2.18.04)
OpenJDK 64-Bit Server VM (build 11.0.11+9-Ubuntu-0ubuntu2.18.04, mixed mode, sharing)