mbtyers / riverdist

River Network Distance Computation and Applications

line2network Error: cannot allocate vector of size 4173.5 Gb #20

Open akarolinamoreno opened 3 months ago

akarolinamoreno commented 3 months ago

Hi,

I have a .shp file of rivers that I downloaded from HydroSHEDS and clipped to my area of interest (8 level 3 basins in South America). I am trying to calculate the distance between several pairs of points, so I loaded the file as an sf object (LINESTRING containing 168,434 observations) in R. When I use the line2network function, it throws an error:

sf <- sf::read_sf(dsn = dir_hydrosheds, layer = "hydroriver")
rivs <- line2network(sf = sf, reproject = 5070)

Units: metre
Error: cannot allocate vector of size 4173.5 Gb

How can I solve this problem? When I load only a small portion of the file, I can use the function, but for such a large network, it throws this error and I cannot proceed.

mbtyers commented 3 months ago

Hi Ana,

You've unfortunately found a major limitation of the riverdist package. Often, .shp files are made of a very large number of line segments. When a rivernetwork object is imported, line2network calculates a matrix to store the connectivity between every segment and every other segment. The size of that NxN matrix grows quadratically with the number of segments N, so when N is large it exceeds the computer's memory. I keep meaning to implement a more memory-efficient way to store the connectivity information, but unfortunately haven't gotten to it.
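To give a feel for the quadratic growth described above, here is a rough back-of-envelope sketch. The helper name `dense_matrix_gb` is made up for illustration, and the 8 bytes per cell is an assumption (the size of an R double); how riverdist actually stores the connectivity matrix, and how many segments it ends up with internally, is not stated in this thread, so the result is not expected to reproduce the exact figure in the error message.

```r
# Rough size of a dense N x N matrix, in GiB.
# ASSUMPTION: 8 bytes per cell (R double); integer or logical cells use 4.
# dense_matrix_gb is a hypothetical helper, not part of riverdist.
dense_matrix_gb <- function(n, bytes_per_cell = 8) {
  n^2 * bytes_per_cell / 1024^3
}

# Doubling the number of segments quadruples the memory needed.
dense_matrix_gb(10000)
dense_matrix_gb(20000)

# Number of features in the shapefile from the original post.
dense_matrix_gb(168434)
```

Because the cost scales with N squared, halving the number of segments cuts the matrix to about a quarter of its size, which is why reducing the segment count helps so much.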

In the meantime, you can work with subsets of your river network if appropriate (as you've discovered), or if you have access to GIS (or some other way to edit your .shp or LINESTRING before importing) it may be very useful to dissolve linear features if possible (combine all the little separate line segments into a smaller number of larger line segments).
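If GIS software is not at hand, the dissolve step could possibly be done in R with the sf package before calling line2network. The following is only a minimal sketch on toy data, not a tested recipe for HydroSHEDS: it unions all lines into one geometry, lets `st_line_merge` join pieces that meet end to end, and casts the result back to individual LINESTRINGs. Note that feature attributes are dropped in the process, and the object names (`lines_sf`, `dissolved`) are invented for the example.

```r
library(sf)

# Toy data: three short LINESTRINGs that join end to end.
segs <- st_sfc(
  st_linestring(rbind(c(0, 0), c(1, 0))),
  st_linestring(rbind(c(1, 0), c(2, 0))),
  st_linestring(rbind(c(2, 0), c(3, 1))),
  crs = 5070
)
lines_sf <- st_sf(id = 1:3, geometry = segs)

# Union everything into one geometry, then merge lines that touch
# at their endpoints into longer lines.
merged <- st_line_merge(st_union(st_geometry(lines_sf)))

# Cast back to one LINESTRING per feature (attributes are lost).
dissolved <- st_sf(geometry = st_cast(merged, "LINESTRING"))

nrow(lines_sf)   # number of segments before
nrow(dissolved)  # number of segments after

# The dissolved object could then be imported as in the original post
# (not run here):
# rivs <- riverdist::line2network(sf = dissolved, reproject = 5070)
```

On a real network, lines are only merged between junctions, so the reduction depends on how finely the source data is split between confluences.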

If you have 8 river basins and they're NOT flow-connected, you might want to work with these separately anyway, as riverdist's tools assume a single flow-connected network.
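Working basin by basin might look roughly like the sketch below. It assumes the line features carry an attribute that identifies their basin; the column name `basin_id` is hypothetical and would need to be replaced by whatever field the actual dataset provides. Toy data stands in for the real shapefile.

```r
library(sf)

# Toy data: two segments in basin "A", one in basin "B".
# ASSUMPTION: a column (here called basin_id) identifies each basin.
lines_sf <- st_sf(
  basin_id = c("A", "A", "B"),
  geometry = st_sfc(
    st_linestring(rbind(c(0, 0), c(1, 0))),
    st_linestring(rbind(c(1, 0), c(2, 0))),
    st_linestring(rbind(c(10, 10), c(11, 10))),
    crs = 5070
  )
)

# Build one sf object per basin.
basin_ids <- unique(lines_sf$basin_id)
by_basin <- lapply(basin_ids, function(b) lines_sf[lines_sf$basin_id == b, ])
names(by_basin) <- basin_ids

# Segment count per basin.
sapply(by_basin, nrow)

# Each basin could then be imported on its own (not run here):
# rivs_A <- riverdist::line2network(sf = by_basin[["A"]], reproject = 5070)
```

Each per-basin network needs a much smaller connectivity matrix than the combined one, since the cost depends on the square of the segment count within that basin only.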

Apologies for the delay in seeing your message! Wish I could offer a better solution than these workarounds,

Matt