fatiando / rockhound

NOTICE: This library is no longer being developed. Use Ensaio instead (https://www.fatiando.org/ensaio). -- Download geophysical models/datasets and load them in Python
BSD 3-Clause "New" or "Revised" License

Decrease memory consumption when reading Bedmap2 dataset #44

Closed santisoler closed 5 years ago

santisoler commented 5 years ago

Description of the desired feature

Reading the Bedmap2 dataset consumes a lot of memory because the entire file is loaded at once. On machines with limited RAM this can exhaust memory and freeze or crash the system.

One way around this is to use Dask, which can read larger-than-memory data by dividing it into chunks.
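
As a rough illustration of the chunking idea (not RockHound's actual loader), the sketch below builds a lazy Dask array split into blocks and reduces it block by block. The grid shape and chunk size are made-up values for the example, not the real Bedmap2 dimensions, and `da.ones` stands in for data that would normally come from a file.

```python
import dask.array as da

# Hypothetical grid shape and chunk size, chosen only for illustration.
shape = (4000, 4000)
chunks = (1000, 1000)

# Build a lazy array split into blocks. No block is materialized
# until a computation is requested.
grid = da.ones(shape, chunks=chunks, dtype="float32")

# Size of one block in bytes versus the size of the full grid.
chunk_nbytes = chunks[0] * chunks[1] * grid.dtype.itemsize
total_nbytes = grid.nbytes

# Reductions run per block and the partial results are combined,
# so the full grid never has to sit in memory as a single array.
mean_value = float(grid.mean().compute())

print(f"blocks: {grid.numblocks}, per-block: {chunk_nbytes} B, total: {total_nbytes} B")
```

With these assumed sizes each block is a small fraction of the full array, which is what keeps peak memory low when the same pattern is applied to a large raster.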

@leouieda I think you had more solutions in mind. Would you like to discuss them?

Are you willing to help implement and maintain this feature? Yes

leouieda commented 5 years ago

I thought of storing a netCDF copy of the data instead of the TIFF, but if open_rasterio can return Dask arrays, then we probably don't need it.