UrbanAnalyst / dodgr

Distances on Directed Graphs in R
https://urbananalyst.github.io/dodgr/
128 stars 16 forks source link

`dodgr_dists` with numeric node indices gives unexpected results #254

Closed luukvdmeer closed 1 month ago

luukvdmeer commented 1 month ago

Hi Mark! I was running dodgr_dists on a simple graph in which the from and to nodes of the edges are encoded by numeric values, rather than by characters as is used in most of your examples. This gives some strange results which I am not sure if they are intended, see the reprex below:

library(dodgr)

graph <- data.frame (
  from = c (1, 2, 2, 2, 3, 3, 4, 4),
  to = c (2, 1, 3, 4, 2, 4, 3, 1),
  d = c (1, 2, 1, 3, 2, 1, 2, 1)
)

# This does not work.
dodgr_dists(graph, from = 1, to = 2)
#> Error in dimnames(x) <- dn: length of 'dimnames' [2] not equal to array extent

# This works.
dodgr_dists(graph, from = 1L, to = 2L)
#>   2
#> 1 1

# However, I don't think it works as expected.
# For example if we let the second edge start from node 3 instead of node 2.
# Then computing distance between 1 and 2 actually computes between node 1 and node 3.
graph <- data.frame (
  from = c (1, 3, 2, 2, 3, 3, 4, 4),
  to = c (2, 1, 3, 4, 2, 4, 3, 1),
  d = c (1, 2, 1, 3, 2, 1, 2, 1)
)

dodgr_dists(graph, from = 1L, to = 2L)
#>   3
#> 1 2

# For the record: Everything works well when nodes are character encoded.
graph <- data.frame (
  from = as.character (c (1, 3, 2, 2, 3, 3, 4, 4)),
  to = as.character (c (2, 1, 3, 4, 2, 4, 3, 1)),
  d = c (1, 2, 1, 3, 2, 1, 2, 1)
)

dodgr_dists(graph, from = "1", to = "2")
#>   2
#> 1 1

Created on 2024-09-26 with reprex v2.1.1

mpadge commented 1 month ago

Thanks for noting that @luukvdmeer. As background and general advice: One design aim of this package was to be "smart" about from and to arguments, and "cleverly" match them to desired values in any many arbitrary ways as possible. They can be integers, numeric values, character vectors, or matrices of coordinates. I thought that was clever. I would never do that again, and would strongly recommend nobody else do anything like that. The world is much easier when you pretend that R is a rigidly-typed language. That said ...

The error you observed was this simple one-line fix. More importantly, the commits above update the docs to clearly state the intention of from and to values when entered as integer-ish vectors: https://github.com/UrbanAnalyst/dodgr/blob/648129f6e42408b1cb5292f2fa2f2fac6a658dc7/R/dists.R#L49-L59

I also added some tests to demonstrate why you see the results you gave above. The vertex map is made by sequentially crawling the from values given in graph construction. In your first example, these are indeed sequential: 1:4, but in your second, they are 1, 3, 2, 4, so a value of to = 2 will map on to the second of these, and return 3. Hopefully that's now all explained much better in the docs, but let me know if you can see any room for improvement.

Thanks as always for helping to make the package better!

luukvdmeer commented 1 month ago

Yes, this docs now make it clearer! I see it is not easy indeed to try to anticipate all different ways people may specify the from and to nodes ;)

agila5 commented 1 month ago

I thought that was clever. I would never do that again, and would strongly recommend nobody else do anything like that. The world is much easier when you pretend that R is a rigidly-typed language.

I will add this quote to my notes for a class on the R software ahahhaha