rafaqz / Rasters.jl

Raster manipulation for the Julia language
MIT License

Allow Users to Read Data Exceeding Free Memory #690

Open JoshuaBillson opened 5 days ago

JoshuaBillson commented 5 days ago

I've noticed that recent versions of Rasters.jl no longer allow me to read rasters into memory when their size exceeds the value reported by Sys.free_memory(). This is creating issues for me, because it appears that Sys.free_memory() does not account for swap space, so it reports that I only have 200 MB available. While I can work around this to some extent by setting lazy=true, I cannot save the transformed rasters back to disk: doing so throws a GDALError with the message "Pointer 'hDS' is NULL in 'GDALGetRasterCount'". There should be an option to override this behaviour, possibly with a warning message when reading data that exceeds the reported free memory.

Edit: On further examination, it appears that Sys.free_memory() is just incorrect. According to my activity monitor, I should have 4 GB of memory available, but Sys.free_memory() is reporting 150 MB. This is on an M1 MacBook Pro running Julia 1.10.4 on macOS Ventura.

felixcremer commented 5 days ago

I assume this is due to https://github.com/rafaqz/Rasters.jl/pull/608. Also lazy=true should work for all operations, at least that would be my aim. Could you please paste the whole GDALError that you get here?

rafaqz commented 5 days ago

The GDAL error should be unrelated to the memory error.

We'll need an MWE for the GDAL error.

I suspect Sys.free_memory is an M1 problem, and needs an issue in base Julia. Here we can add a keyword to ignore it, like force_memory?

felixcremer commented 5 days ago

Couldn't we make a setting for the available memory that defaults to Sys.free_memory() but can be changed, so that you could also restrict it to lower values? This would be similar to what YAXArrays is doing with the YAXDefaults: https://github.com/JuliaDataCubes/YAXArrays.jl/blob/54b7f6de5ead49dac1871879af6bc7c6a3c50a79/src/YAXArrays.jl#L13

global const RASTER_MEM = Ref(Sys.free_memory())

And then use RASTER_MEM for the check in read.
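A minimal sketch of how such a check might look. The helper names reset_raster_mem!, required_bytes and check_memory are hypothetical, and none of this is existing Rasters.jl API. One caveat worth noting: in a precompiled package, a top-level Ref(Sys.free_memory()) would likely capture the value seen at precompile time, so the sketch refreshes the value explicitly instead.

```julia
# Hypothetical sketch: RASTER_MEM and the helpers below are illustrative names,
# not part of the Rasters.jl API.
const RASTER_MEM = Ref{Int}(0)

# Refresh the limit from the OS; a package would probably call this from __init__.
reset_raster_mem!() = (RASTER_MEM[] = Int(Sys.free_memory()); RASTER_MEM[])

# Bytes needed to hold a dense array of element type T with the given size.
required_bytes(::Type{T}, sz) where {T} = sizeof(T) * prod(sz)

# Return true if such an array fits within the configured limit.
check_memory(::Type{T}, sz) where {T} = required_bytes(T, sz) <= RASTER_MEM[]

reset_raster_mem!()
RASTER_MEM[] = 2 * 1024^3   # user override: allow up to 2 GiB
```

With the limit set this way, check_memory(Float32, (1000, 1000)) passes, while a 100_000 x 100_000 Float64 array would be rejected. Setting RASTER_MEM[] to a smaller number restricts reads in the same way.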

rafaqz commented 5 days ago

Ok that's better. Let's start with a keyword like max_memory=x in GB (or MB?).

We can plan a preference system too.

Edit: actually the keyword might be annoying to propagate? Maybe a global setting is better. I just feel like we need a clear, consistent way to do preferences.
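The two options need not be exclusive. As a rough sketch, a keyword can fall back to a global default, so that only the outermost call has to accept it. MAX_MEMORY_GB and fits_in_memory are hypothetical names chosen for illustration, and GB is assumed as the unit.

```julia
# Hypothetical sketch: MAX_MEMORY_GB and fits_in_memory are illustrative names only.
const MAX_MEMORY_GB = Ref{Float64}(Sys.free_memory() / 1e9)

# `max_memory` is given in GB; when omitted, the current global default is used.
# Keyword defaults are evaluated at call time, so later changes to the Ref apply.
function fits_in_memory(nbytes::Integer; max_memory::Real=MAX_MEMORY_GB[])
    return nbytes / 1e9 <= max_memory
end

MAX_MEMORY_GB[] = 1.0                          # set the global default once
fits_in_memory(4_000_000_000)                  # false under the 1 GB default
fits_in_memory(4_000_000_000; max_memory=8)    # true with a per-call override
```

A persistent preference system could later supply the initial value of the global without changing the call sites.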