IMMM-SFA / gamut

An R package to identify multi-sector teleconnection complexity
https://immm-sfa.github.io/gamut/

Memory constraints for large watersheds #37

Closed swd-turner closed 5 years ago

swd-turner commented 5 years ago

So far I've been unable to run the watershed count function for the cities of New Orleans | LA and Saint Louis | MO. Their watersheds are so large that the raster cropping and mapping seem to require a huge amount of memory. I attempted the run with 2 entire nodes on PIC and it still failed, so we need a solution. More generally, raster cropping and mapping are slow for larger watersheds, so anything that identifies those cases and speeds things up would be valuable.

swd-turner commented 5 years ago

@crvernon the bottleneck is in io.R... function get_raster_val_classes, which is called from get_watershed_teleconnections
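For illustration only, here is one generic way to keep memory bounded when tallying class values over a very large grid: process it in row blocks and accumulate the counts. This is a sketch, not the code in get_raster_val_classes and not necessarily the fix adopted for this issue. The function name tally_classes_in_blocks and the block_rows parameter are hypothetical, and a plain matrix stands in for a raster that would in practice be read block by block from disk.

```r
# Tally class values of a large grid block by block, so that only
# `block_rows` rows are copied and tabulated at any one time.
# Hypothetical helper: a plain matrix stands in for an on-disk raster.
tally_classes_in_blocks <- function(grid, block_rows = 1000L) {
  stopifnot(is.matrix(grid), nrow(grid) >= 1L, block_rows >= 1L)
  counts <- integer(0)  # named running totals, one entry per class value
  starts <- seq(1L, nrow(grid), by = block_rows)
  for (start in starts) {
    end <- min(start + block_rows - 1L, nrow(grid))
    block <- grid[start:end, , drop = FALSE]
    block_counts <- table(block, useNA = "no")  # NA cells are ignored
    for (cls in names(block_counts)) {
      previous <- if (cls %in% names(counts)) counts[[cls]] else 0L
      counts[[cls]] <- previous + as.integer(block_counts[[cls]])
    }
  }
  # Return the totals ordered by class value
  counts[order(as.numeric(names(counts)))]
}

# Small demonstration grid with three classes and some missing cells
set.seed(42)
demo_grid <- matrix(sample(c(1L, 2L, 5L, NA), 600, replace = TRUE), nrow = 30)
demo_counts <- tally_classes_in_blocks(demo_grid, block_rows = 7L)
print(demo_counts)
```

The block size trades speed for peak memory: smaller blocks mean more iterations but a smaller working copy at each step.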

If you need to run in PIC (which is handy since this is where the data are stored) you probably want to use R/3.5.1 and load the following modules:

    $ module load gdal/2.1.2
    $ module load proj4/4.9.3
    $ module load geos/3.4.2
    $ module load gcc/5.2.0

If you want to run a single city, supply something like New Orleans | LA to the cities argument in count_watershed_teleconnections. Please give me a call if anything is unclear!
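As a rough sketch of the single-city call described above: the "City | ST" label format and the cities argument come from this thread, while the make_city_label helper, the data_dir variable, and the position of the data directory in the call are assumptions that should be checked against the package documentation. The call is guarded so that nothing runs unless a data directory is supplied and gamut is installed.

```r
# Build "City | ST" labels in the form quoted in this thread.
# make_city_label is a hypothetical helper, not part of gamut.
make_city_label <- function(city, state) {
  stopifnot(length(city) == length(state))
  paste(trimws(city), toupper(trimws(state)), sep = " | ")
}

cities <- make_city_label(c("New Orleans", "Saint Louis"), c("LA", "MO"))

# Hypothetical: path to the input data. Left empty so the call below
# is skipped by default.
data_dir <- ""

if (nzchar(data_dir) && requireNamespace("gamut", quietly = TRUE)) {
  # Assumption: the data directory is passed as the first argument.
  result <- gamut::count_watershed_teleconnections(data_dir, cities = cities[1])
}
```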

crvernon commented 5 years ago

@swd-turner I think I found an efficient fix for this. I’m getting on a plane now but will test ASAP.