luukvdmeer / sfnetworks

Tidy Geospatial Networks in R
https://luukvdmeer.github.io/sfnetworks/

add_vertices error inconsistent between networks #173

Closed benjaminhlina closed 2 years ago

benjaminhlina commented 2 years ago

This issue seems to be somewhat related to thomasp85/tidygraph#89. I'm currently trying to use sfnetworks, which I understand allows sf objects to be passed to tidygraph- and igraph-like functions to create spatial networks. I'm using sfnetworks to determine seasonal fish movement within a multibasin lake. Depending on the fish, some networks work and others do not, and the ones that do not always fail with the following error.

Error in add_vertices(gr, nrow(nodes) - gorder(gr)) : 
  At type_indexededgelist.c:369 : cannot add negative number of vertices, Invalid value

I cannot figure out why the error occurs. Based on thomasp85/tidygraph#89, this seems to be an issue with the name of the column in the node data, which doesn't add up to me, considering mine is called rec_group. If this is more of a Stack Overflow question I can move it over there; I just thought I'd create an issue report, as the error message seems to be related to thomasp85/tidygraph#89. I have created the following reprex and provided Dropbox links to the example data.

I've wondered whether the issue is that the edge linestring geometry does not exactly match the node point geometry, which is why I use force = TRUE in my sfnetwork() call. The mismatch arises because I separately ran a cost distance analysis between each pair of nodes that takes the lake shape into account, as I don't want edges on land, and created a separate sf object with those shortest-path linestrings. I then create an edge list (to-from) per fish per season and join this cost distance sf object to each unique to-from combination. As a result, the start and end points of the edge linestrings are slightly different from the node geometries. However, I don't believe this is the issue: if I make a straight linestring (across land) from node a to node b that matches the node coordinates, I still get the error. You'll also notice that there are some empty linestrings due to self-loops within the network; I'm not sure whether these also pose a problem.
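To put a number on "slightly different", the offset between an edge endpoint and its intended node can be measured directly. The following is only a base-R sketch on plain coordinate vectors; the coordinates are illustrative values of the same magnitude as the data in this thread and are not read from the .rds files.

```r
# Illustrative node coordinates (projected CRS, metres); NOT read from the .rds files.
node_xy <- rbind(a = c(518209.8, 5074428),
                 b = c(518676.6, 5072428))

# Illustrative first and last vertex of one edge linestring.
edge_start <- c(518210.7, 5074426)
edge_end   <- c(518675.7, 5072426)

# Euclidean distance between each edge endpoint and the node it should touch.
offset_start <- sqrt(sum((edge_start - node_xy["a", ])^2))
offset_end   <- sqrt(sum((edge_end - node_xy["b", ])^2))

print(c(start = offset_start, end = offset_end))
```

Offsets of a few metres like these mean the linestring boundaries do not coincide with the node points, which is the situation force = TRUE is being used for here.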

Links to network for fish 05550 that succeeds: edges_sf_scd nodes_sf_scd

library(sfnetworks)

edges_sf_scd <- readr::read_rds(file = "edges_sf_scd.rds")

nodes_sf_scd <- readr::read_rds(file = "nodes_sf_scd.rds")

net_scd <- sfnetwork(nodes = nodes_sf_scd, 
                     edges = edges_sf_scd, 
                     directed = TRUE, 
                     edges_as_lines = TRUE,
                     node_key = "rec_group",
                     force = TRUE)

net_scd
str(net_scd)

Links to network for fish 05802 that fails: edges_sf_fail nodes_sf_fail

edges_sf_fail <- readr::read_rds(file = "edges_sf_fail.rds")

nodes_sf_fail <- readr::read_rds(file = "nodes_sf_fail.rds")

net_fail <- sfnetwork(nodes = nodes_sf_fail, 
                      edges = edges_sf_fail, 
                      directed = TRUE, 
                      edges_as_lines = TRUE,
                      node_key = "rec_group",
                      force = TRUE)

Error in add_vertices(gr, nrow(nodes) - gorder(gr)) : 
  At type_indexededgelist.c:369 : cannot add negative number of vertices, Invalid value

agila5 commented 2 years ago

Hi @benjaminhlina. I checked the data you provided and I have one question. The following is a printout of the node data:

#> Simple feature collection with 5 features and 1 field
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 516227.3 ymin: 5069850 xmax: 519880 ymax: 5075320
#> CRS:           +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs
#> # A tibble: 5 x 2
#>   rec_group                    geometry
#>   <chr>                     <POINT [m]>
#> 1 Central East-Basin (518676.6 5072428)
#> 2 North East-Basin   (518209.8 5074428)
#> 3 North West-Basin   (517136.8 5075320)
#> 4 South East-Basin     (519880 5069850)
#> 5 Sucker Creek       (516227.3 5073592)

while the following is a printout of the edges data (I included only the from and to columns):

#> Simple feature collection with 13 features and 2 fields (with 5 geometries empty)
#> Geometry type: LINESTRING
#> Dimension:     XY
#> Bounding box:  xmin: 516170.7 ymin: 5069851 xmax: 519900.7 ymax: 5075321
#> CRS:           +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs
#> # A tibble: 13 x 3
#>     from    to                                                          geometry
#>    <int> <int>                                                  <LINESTRING [m]>
#>  1     1     3 (518210.7 5074426, 518215.7 5074416, 518220.7 5074406, 518215.7 ~
#>  2     3     7 (516225.7 5073591, 516235.7 5073596, 516245.7 5073601, 516255.7 ~
#>  3     3     1 (518675.7 5072426, 518670.7 5072436, 518665.7 5072446, 518660.7 ~
#>  4     7     3 (518210.7 5074426, 518205.7 5074416, 518200.7 5074406, 518195.7 ~
#>  5     1     4 (519880.7 5069851, 519875.7 5069861, 519870.7 5069871, 519865.7 ~
#>  6     8     7 (516225.7 5073591, 516230.7 5073601, 516235.7 5073611, 516240.7 ~
#>  7     4     1 (518675.7 5072426, 518680.7 5072416, 518685.7 5072406, 518690.7 ~
#>  8     7     8 (517135.7 5075321, 517130.7 5075311, 517125.7 5075301, 517120.7 ~
#>  9     3     3                                                             EMPTY
#> 10     4     4                                                             EMPTY
#> 11     1     1                                                             EMPTY
#> 12     8     8                                                             EMPTY
#> 13     7     7                                                             EMPTY

Are you sure that the structure is correct? The from and to columns are specified as integers and refer to nodes that do not exist in the nodes table (e.g. ids 7 or 8), which has only 5 rows. You get the same problem in the following example, where I create an edge between nodes 2 and 3 while there are only 2 nodes in the nodes table:

library(tidygraph)
tbl_graph(
  nodes = data.frame(id = 1:2), 
  edges = data.frame(from = c(1, 2), to = c(2, 3))
)
#> Error in add_vertices(gr, nrow(nodes) - gorder(gr)): At type_indexededgelist.c:369 : cannot add negative number of vertices, Invalid value

Created on 2021-09-16 by the reprex package (v2.0.1)

Hope it's clear.
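A quick pre-check along these lines can surface the mismatch before tbl_graph() or sfnetwork() is called. This is only a base-R sketch: the helper name check_edge_indices() is made up for illustration and is not part of tidygraph or sfnetworks. The data frames mirror the 5-row nodes table and the first four from/to pairs printed above.

```r
# Hypothetical helper (not part of tidygraph or sfnetworks): return the edges
# whose integer from/to values do not point at an existing row of the nodes table.
check_edge_indices <- function(nodes, edges) {
  n <- nrow(nodes)
  bad <- edges$from < 1 | edges$from > n | edges$to < 1 | edges$to > n
  edges[bad, , drop = FALSE]
}

# Five nodes, as in the printed nodes table.
nodes <- data.frame(
  rec_group = c("Central East-Basin", "North East-Basin", "North West-Basin",
                "South East-Basin", "Sucker Creek")
)

# The first four from/to pairs of the printed edges table.
edges <- data.frame(from = c(1L, 3L, 3L, 7L), to = c(3L, 7L, 1L, 3L))

# Edges that reference node 7 are flagged, since only rows 1 to 5 exist.
invalid <- check_edge_indices(nodes, edges)
print(invalid)
```

An empty result would mean every edge points at an existing node row; any rows returned are the ones to fix first.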

benjaminhlina commented 2 years ago

That makes sense; I apologize for not having the data line up properly. I have amended both the node and edge files that failed, so that the id column in nodes_sf_fail matches both the from and to columns in edges_sf_fail. I still kept getting the error, but when I switched the id, from, and to columns from integers to characters in both the node and edge objects, it all worked. I had misunderstood both the class each of these columns needed to be and what the node_key argument does: it tells sfnetwork() which column holds the node ids if it is not the first column. For some reason that wasn't clear when I read the help file, which is my own fault, as the help file is quite clear, ha. Thank you for checking this out for me, as I was very unsure what the error message was telling me. Usually I can figure out what an error means and/or find examples, but there seems to be very limited info on errors surrounding tidygraph. I have closed the issue.

Side comment: I know the package isn't there yet, but some sort of integration with ggraph, or the ability to put arrows on the edges to indicate directionality, would be awesome! Otherwise the package is great; thank you so much for your help and the development of the package!

agila5 commented 2 years ago

I still kept getting the error but when I switch the id, from, and to columns from integers to characters in both node and edges objects it all works. ... I misunderstood both the class each of these columns needed to be as well as what the argument node_key does, as it tells sfnetwork what column to look for node ids if it is not the first column.

Just to be a little clearer than the previous example, I will copy one part of the first introductory vignette: "Instead of from and to columns containing integers that refer to node indices, the provided edges table can also have from and to columns containing characters that refer to node keys. In that case, you should tell the construction function which column in the nodes table contains these keys. Internally, they will then be converted to integer indices." Please also check the examples in the first vignette and make sure that those character columns link the right nodes.
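As a rough illustration of what "converted to integer indices" means, the lookup can be mimicked in base R with match(). This is a sketch of the idea only, not the code sfnetworks runs internally; the node names come from the printout earlier in this thread, and the edges are made up.

```r
# Node table with a character key column (names taken from the printout above).
nodes <- data.frame(
  rec_group = c("Central East-Basin", "North East-Basin", "North West-Basin",
                "South East-Basin", "Sucker Creek"),
  stringsAsFactors = FALSE
)

# Made-up edges that refer to nodes by key rather than by row number.
edges <- data.frame(
  from = c("Sucker Creek", "North East-Basin", "South East-Basin"),
  to   = c("North East-Basin", "Central East-Basin", "South East-Basin"),
  stringsAsFactors = FALSE
)

# match() returns the row position of each key in the nodes table,
# or NA when a key has no counterpart there.
from_idx <- match(edges$from, nodes$rec_group)
to_idx   <- match(edges$to, nodes$rec_group)

# Any NA would mean an edge refers to a node that does not exist.
stopifnot(!anyNA(from_idx), !anyNA(to_idx))

print(data.frame(from = from_idx, to = to_idx))
```

Checking for NA after the lookup is a cheap way to confirm that every character key in the edges table has a matching row in the nodes table.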

For some reason that wasn't clear when I read the helpfile which is my own fault as the helpfile is quite clear ha.

No worries, that happens to me all the time 😅

Side comment: I know the package isn't there yet, but some sort of integration with ggraph, or the ability to put arrows on the edges to indicate directionality, would be awesome! Otherwise the package is great; thank you so much for your help and the development of the package!

Check this blogpost by @loreabad6 (and, in particular, the argument arrow in geom_edge_link)!

benjaminhlina commented 2 years ago

Thank you for providing more detail on how the node_key argument works. Pretty slick that sfnetwork() can handle character strings for the node keys and for from and to in the edge table. The names of the locations matter to me because I'm trying to see how fish deal with fragmented habitat, and those habitats have character names, which are more important than the arbitrary numbers I assign them. I think somewhere I got confused between multiple network packages (igraph, tidygraph, etc.) to the point that I was mismatching integers and characters, as seen above. Thanks again for catching all of this!

I had found a blog post that uses ggmap and ggraph to map networks but it isn't quite what I was looking for especially as I wanted to use sf objects.

Thanks for sharing @loreabad6's blog post as well; that is exactly what I'm looking for! The post itself is great and super helpful! Thanks @loreabad6 for your work on this :)