rspatial / terra

R package for spatial data handling https://rspatial.github.io/terra/reference/terra-package.html
GNU General Public License v3.0

mem_info reports free memory instead of available memory on Linux #1506

Open cedricr opened 4 months ago

cedricr commented 4 months ago

I’ve noticed that mem_info(rast()) reports unexpectedly low numbers on my system, even with only the R session open. These figures match the free column of free -g in the shell, whereas I would expect to see the value from the available column instead.

As the free memory of a Linux system tends towards 0 in normal use, this looks like a rather serious problem: once the system has been in use for a while and the memory caches have filled up, terra will probably never process anything in memory.

Ex:

r$> mem_info(rast())

------------------------
Memory (GB) 
------------------------
check threshold : 1 (memmin)
available       : 33.09
allowed (80%)   : 26.48
needed (n=1)    : 0
------------------------
proc in memory  : TRUE
nr chunks       : 1
------------------------

and at the same time in the shell:

~$ free -g
               total        used        free      shared  buff/cache   available
Mem:              62          16          33           9          21          45
Swap:              7           7           0

So in that case, I have 45 GB available, but terra only sees 33.
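
For context, here is a minimal sketch of where a "free"-style figure can come from. It assumes the value is read through the Linux sysinfo(2) call (which the memInfo.freeram field mentioned further down suggests, though I have not confirmed this is exactly what terra does). The function name sysinfo_free_bytes is hypothetical.

```cpp
#include <sys/sysinfo.h>

// Hypothetical helper, not part of terra.
// Returns the memory that sysinfo(2) reports as completely unused, in bytes,
// or -1 on failure. This corresponds to the "free" column of free(1):
// reclaimable page cache is NOT counted, so the figure shrinks as caches fill.
double sysinfo_free_bytes() {
    struct sysinfo info;
    if (sysinfo(&info) != 0) {
        return -1.0;
    }
    // freeram is expressed in units of mem_unit bytes.
    return static_cast<double>(info.freeram) * info.mem_unit;
}
```

Nothing in struct sysinfo corresponds directly to the available column, which would explain why the number only matches after the caches are dropped.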

If I then flush the cache manually (as documented at https://linux-mm.org/Drop_Caches) with

~$ echo 3 | sudo tee /proc/sys/vm/drop_caches
3

free now returns the same value for free and available

~$ free -g
               total        used        free      shared  buff/cache   available
Mem:              62          17          44           9          10          44
Swap:              7           7           0

and so does mem_info:

r$> mem_info(rast())

------------------------
Memory (GB) 
------------------------
check threshold : 1 (memmin)
available       : 44.31
allowed (80%)   : 35.44
needed (n=1)    : 0
------------------------
proc in memory  : TRUE
nr chunks       : 1
------------------------

I initially thought that setting ram = memInfo.freeram + memInfo.bufferram here: https://github.com/rspatial/terra/blob/e27f4e53e94fc829bbbf46472abe52d2b4151cf3/src/ram.cpp#L44-L45 could do the trick, but apparently it’s more complex than that. According to this commit to the Linux kernel, the best way would be to extract the MemAvailable field from /proc/meminfo.
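
A rough sketch of what reading that field could look like, offered as an illustration rather than a patch for ram.cpp. The function names meminfo_kb and available_ram_bytes are hypothetical, and the fallback to MemFree is my own assumption for kernels that do not expose MemAvailable (as far as I know it was added in Linux 3.14).

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: scan a /proc/meminfo-style stream for `key`
// (e.g. "MemAvailable") and return its value in kB, or -1 if not found.
long long meminfo_kb(std::istream& in, const std::string& key) {
    const std::string prefix = key + ":";
    std::string line;
    while (std::getline(in, line)) {
        // Lines look like "MemAvailable:   47185920 kB".
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream fields(line.substr(prefix.size()));
            long long kb = -1;
            if (fields >> kb) {
                return kb;
            }
            return -1;
        }
    }
    return -1;
}

// Hypothetical helper: available RAM in bytes, or -1 if it cannot be read.
// Prefers MemAvailable and falls back to MemFree (an assumption, see above).
double available_ram_bytes() {
    std::ifstream f("/proc/meminfo");
    if (!f) {
        return -1.0;
    }
    long long kb = meminfo_kb(f, "MemAvailable");
    if (kb < 0) {
        f.clear();   // reset EOF state before re-reading the file
        f.seekg(0);
        kb = meminfo_kb(f, "MemFree");
    }
    return kb < 0 ? -1.0 : static_cast<double>(kb) * 1024.0;
}
```

Parsing from a std::istream keeps the lookup testable with a string buffer; only available_ram_bytes touches the real /proc/meminfo.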


r$> packageVersion("terra")
[1] ‘1.7.71’

r$> terra::gdal(lib="all")
    gdal     proj     geos 
 "3.8.5"  "9.3.1" "3.12.1" 

r$> Sys.info()[c('sysname', 'release')]
                sysname                 release 
                "Linux" "6.8.8-300.fc40.x86_64" 

jflowernet commented 1 month ago

I'm having the same issue (using Linux Mint): terra is processing from disk when there is easily enough available RAM to load the whole raster into memory. Thanks for showing how to do the cache flush @cedricr

Would be great to have a fix for this.