NIEHS / chopin

Scalable GIS methods for environmental and climate data analysis
https://niehs.github.io/chopin/

Maximum number of threads per R installation #41

Closed by sigmafelix 4 months ago

sigmafelix commented 5 months ago

The default build setting of R allows up to 125 concurrent processes. This causes a problem when running multiple tasks across different nodes on HPC. We need to test whether R sessions running in containers are independent of the maximum thread limit of the local R installation on HPC.
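For reference, a minimal sketch of where the 125 figure comes from. It assumes the default build limit of 128 connections, three of which are pre-allocated for stdin, stdout, and stderr, and socket-based workers that each hold one connection. `max_conns` and `free_conns` are illustrative names, and a build with a different limit will give a different number.

```r
# Assumption: default build limit on allocated connections.
max_conns <- 128L

# Connections currently allocated in this session, including the three
# standard ones (stdin, stdout, stderr).
in_use <- nrow(showConnections(all = TRUE))

# Rough upper bound on socket-based workers this session could still start,
# since each such worker occupies one connection.
free_conns <- max_conns - in_use
print(free_conns)
```

In a fresh session with nothing else open, only the three standard connections are in use, which leaves 125 slots.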

sigmafelix commented 5 months ago

The maximum possible number of processes was tested in two Apptainer containers. The test case was the crop fraction in 10-kilometer circular buffers at 8+ million points in the mainland US. Both runs finished without errors, and the two results were identical. The duration was 15+% longer (~12 minutes). This may imply that too many concurrent accesses to the same file (the crop raster file) negatively affect performance. However, it should be noted that this is a very rare case, as users will typically parallelize across multiple files.

sigmafelix commented 4 months ago

callr and future.callr will bypass the 125 concurrent process limit.
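A minimal sketch of switching to the callr backend, assuming the future and future.callr packages are installed. The worker count of 2 is only a placeholder for illustration; the point is that these workers run as separate background R processes rather than socket cluster nodes, so they should not each take up one of the session's connection slots.

```r
library(future)
library(future.callr)

# Use callr-based background R processes as workers.
# workers = 2L is a placeholder, not a recommended value.
plan(callr, workers = 2L)

# Launch one future per task; each resolves in its own R process.
fs <- lapply(seq_len(2L), function(i) future(Sys.getpid()))

# Collect the worker process IDs.
pids <- vapply(fs, value, integer(1))
print(pids)

# Return to sequential processing when done.
plan(sequential)
```

The returned process IDs differ from the main session's `Sys.getpid()`, which confirms the work ran in separate R processes.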