luukvdmeer / sfnetworks

Tidy Geospatial Networks in R
https://luukvdmeer.github.io/sfnetworks/

The future of sfnetworks #4

Closed luukvdmeer closed 4 years ago

luukvdmeer commented 5 years ago

Building on the r-spatial blogpost on tidy spatial networks and the related issue in the spnethack repository, it seems to be time to discuss how to proceed with the sfnetworks package.

Firstly, it is important to say again that this package was originally developed as a homework assignment for an R class. As you may know, when doing a master's degree, there are a lot of parallel courses that all have homework assignments, and you don't always have (or make..) enough time for each of them ;-) Therefore, some parts of the package were created as easy ways to meet the requirements of the assignment (which included a minimum number of classes and a minimum number of functions). For example, the sfn_route class, and some of the functions, do not really add value in my view, but just had to be there. Long story short: there is a need to clean this up and remove everything that is unnecessary.

Secondly, during the time I wrote the package, I was experimenting with the tidyverse, and therefore, all the source code is written in tidyverse style, including dplyr verbs and pipes. I think this is bad practice, and would like to rewrite this as much as possible in just base R.

Thirdly, the package was written before I knew the tidygraph package. In the blogpost, we showed that tidygraph can be a nice way to work with tidy spatial networks. That connection between sf and tidygraph should be the main building block of the package. One of the questions that then arises is whether there is still a need for an sfnetwork class, or whether using tbl_graph for network operations, and 'escaping' to sf for spatial operations, is enough.

I would argue that an sfnetwork class that subclasses tbl_graph is the best way to go. In tidygraph, you use the activate() verb to specify whether you want to work on the nodes or the edges, and then you can use most of the dplyr verbs as if your nodes/edges were tibbles (and also use columns from the other table if needed). For example (Note: I will use the graph object from the blogpost in the examples):

graph %>%
  activate(nodes) %>%
  mutate(nodeIDplusOne = nodeID + 1) %>%
  activate(edges) %>%
  # .N() gives access to the node table; index it by `from` to get one value per edge
  mutate(edgeIDplusNodeID = edgeID + .N()$nodeID[from])

Some verbs that reduce the number of nodes/edges, also have effects outside of the currently active data. For example, when using filter() and slice() on the nodes, it will also remove the edges terminating at the removed nodes. The same thing happens when doing any kind of dplyr *_join(). For example, if you have a subset of your nodes as a tibble, you can do a right join with the nodes table of the graph. Then, only the nodes of your subset remain, and also only the edges that connect to these nodes.
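
The edge-pruning behaviour of filter() described above can be checked on a small example. This is a minimal sketch on a hypothetical four-node path graph (not the graph object from the blogpost); the node and edge tables are made up for illustration.

```r
library(tidygraph)
library(dplyr)

# Hypothetical toy data: a path graph 1 - 2 - 3 - 4
nodes <- tibble(nodeID = 1:4)
edges <- tibble(from = 1:3, to = 2:4)
g <- tbl_graph(nodes = nodes, edges = edges, directed = FALSE)

# Drop node 4; the edge 3 - 4 that terminates at it is removed as well
g_sub <- g %>%
  activate(nodes) %>%
  filter(nodeID <= 3)

n_nodes <- igraph::gorder(g_sub)  # number of remaining nodes
n_edges <- igraph::gsize(g_sub)   # number of remaining edges
```

Here three nodes and two edges should remain, even though only the node table was filtered.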

This behaviour would be very useful when doing geometric operations on the network. For example, I could think of 'escaping' the nodes to an sf object, intersecting them with a polygon, joining the result back in, and then also keep only those edges that connect to the 'new' set of nodes. Something like this:

polygon = st_sfc(
  st_polygon(list(rbind(c(7.62, 51.95), c(7.63, 51.95), c(7.63, 51.96), c(7.62, 51.96), c(7.62, 51.95)))),
  crs = st_crs(graph %>% activate(nodes) %>% as_tibble() %>% st_as_sf())
)

intersection = graph %>%
  activate(nodes) %>%
  as_tibble() %>%
  st_as_sf() %>%
  st_intersection(polygon)

new_graph = graph %>%
  activate(nodes) %>%
  st_join(intersection)

However, this does not work, because st_join does not know what to do with an object of class tbl_graph. Using a dplyr right_join with the nodeID column also fails, even when converting the intersection object to a tibble first.

new_graph = graph %>%
  activate(nodes) %>%
  right_join(intersection, by = "nodeID")

new_graph = graph %>%
  activate(nodes) %>%
  right_join(as_tibble(intersection), by = "nodeID")

Only when we remove the geometry list column from both the nodes table and the intersection object does the join work. But of course, we don't want to do that ;-)
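
As a stopgap, one could avoid the join altogether: escape the nodes to sf, find which node IDs fall inside the polygon, and then filter() the graph on those IDs so that tidygraph prunes the edges. The sketch below is an assumption, not something from the blogpost. It uses a hypothetical three-node toy graph and polygon without a CRS, and it assumes tbl_graph() accepts an sf object as its nodes table, as in the blogpost.

```r
library(sf)
library(tidygraph)
library(dplyr)

# Hypothetical toy network: three point nodes connected in a path
nodes <- st_sf(
  nodeID = 1:3,
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(5, 5)))
)
edges <- tibble(from = 1:2, to = 2:3)
g <- tbl_graph(nodes = nodes, edges = edges, directed = FALSE)

# Hypothetical polygon covering nodes 1 and 2 only
polygon <- st_sfc(
  st_polygon(list(rbind(c(-1, -1), c(2, -1), c(2, 2), c(-1, 2), c(-1, -1))))
)

# Escape the nodes to sf and collect the IDs of the nodes that intersect the polygon
nodes_sf <- g %>% activate(nodes) %>% as_tibble() %>% st_as_sf()
keep <- nodes_sf$nodeID[lengths(st_intersects(nodes_sf, polygon)) > 0]

# Filter on the ID column; edges touching removed nodes are dropped by tidygraph
new_graph <- g %>%
  activate(nodes) %>%
  filter(nodeID %in% keep)

n_nodes <- igraph::gorder(new_graph)
n_edges <- igraph::gsize(new_graph)
```

This keeps the geometry column in place, but it still requires the detour through sf, and it only covers the case where the spatial operation selects existing nodes without changing them.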

The solution here would be that when activating the nodes or edges, you don't treat them as a tibble, but as an sf object. Then, things like this would work, without any need to escape to an sf object first and (if it even works) merge the results back in:

graph %>%
  activate(nodes) %>%
  st_intersection(polygon) # This will also keep only those edges that connect to the 'intersecting nodes'

graph %>%
  activate(edges) %>%
  mutate(length = st_length(.))

Also, dplyr verbs like select() would use their sf method. Hence, sticky geometry would also work in the graph structure. An sfnetwork class that subclasses tbl_graph could implement this. We could also think of, for example, special methods for st_transform, which would transform both the nodes and edges at once, and for st_as_sf, which would make it possible to escape to an sf object without having to go through a tibble first.

For now, I will release the current version of sfnetworks as v0.1.0, such that people already using it can keep using it in the way it is, and then (try to) implement the things mentioned above in a new version 0.2.0, before considering uploading it to CRAN.

Discussion and feedback are welcome! @robinlovelace @loreabad6

Robinlovelace commented 5 years ago

Thanks for the detailed update @luukvdmeer, this sounds like a good plan! Cc @agila5 who has been looking at spatial networks in stplanr. I'm interested in the possibility of replacing the class definition sfNetwork in stplanr with sfnetwork in sfnetworks if it gets onto CRAN and shows real benefit so would like to support this work. Big 👍 from me.

We have been developing a series of tests for spatial networks that I think could help your developments. Great you're up for taking this forward, and fully support your plan to make the changes before submitting this to CRAN.

loreabad6 commented 5 years ago

It makes a lot of sense and I think it should be a good way of developing the package!

@Robinlovelace:

1) Do you have the tests you've been working on documented somewhere? It would be interesting to take a look!

2) We were thinking of submitting a proposal to the R-Consortium to further develop the package and make it ready for CRAN. Do you think this would make sense for the aim of this package?

Robinlovelace commented 5 years ago

  1. Yes, we have just added some examples to stplanr and have some larger ones for tests on the Isle of Wight and Leeds, where we're exploring the potential for spatial networks to estimate traffic flow.

  2. I think that's a good idea and would happily contribute. Are you looking for collaborators on the proposal? Network cleaning and preparation are important steps in the analysis of spatial networks, as shown in the blog post. @agila5 (Andrea Gilardi, PhD student at the University of Milan) and I have recently compared the break tool in GRASS's v.clean function with other ways of preparing the network and have written a function rnet_breakup_vertices() that would complement a package on spatial networks nicely, I think, and potentially remove the reliance on rgrass7 for network cleaning. We have plenty of use cases and applications in mind, so we are confident we can provide useful input into the proposal and its outputs.

We've done some recent work on spatial network analysis and data preparation. The reprex below, for example, shows 3 different cases of network cleaning in preparation for subsequent spatial network creation and analysis steps.

  1. A roundabout is represented as a circular linestring that spatial network representations would struggle to route on.

  2. An overpass that the break tool would break up, but which should not really be broken at all line intersections because there is a 'grade separation' (height difference) between the different roads.

  3. A real-world example of a cycleway in OSM that intersects with road vertices and does need to be represented as separate linestrings from a spatial network perspective.

devtools::install_github("ropensci/stplanr")
#> Skipping install of 'stplanr' from a github remote, the SHA1 (982c7f08) has not changed since last install.
#>   Use `force = TRUE` to force installation
library(stplanr)
#> Registered S3 method overwritten by 'R.oo':
#>   method        from       
#>   throw.default R.methodsS3
library(sf)
#> Linking to GEOS 3.5.1, GDAL 2.1.2, PROJ 4.9.3
# Check for roundabout
par(mar = rep(0, 4))
plot(rnet_roundabout$geometry, lwd = 2, col = rainbow(nrow(rnet_roundabout)))

rnet_roundabout_clean <- rnet_breakup_vertices(rnet_roundabout)
#> Splitting rnet object at the intersection points between nodes and internal vertexes
plot(rnet_roundabout_clean$geometry, lwd = 2, col = rainbow(nrow(rnet_roundabout_clean)))

# Check for overpasses
plot(rnet_overpass$geometry, lwd = 2, col = rainbow(nrow(rnet_overpass)))

rnet_overpass_clean <- rnet_breakup_vertices(rnet_overpass)
#> Splitting rnet object at the intersection points between nodes and internal vertexes
plot(rnet_overpass_clean$geometry, lwd = 2, col = rainbow(nrow(rnet_overpass_clean)))

mapview::mapview(rnet_overpass_clean)

# Check for intersection with no node
plot(rnet_cycleway_intersection$geometry, lwd = 2,
     col = rainbow(nrow(rnet_cycleway_intersection)))

rnet_cycleway_intersection_clean <- rnet_breakup_vertices(rnet_cycleway_intersection)
#> Splitting rnet object at the duplicated internal vertexes
plot(rnet_cycleway_intersection_clean$geometry,
     lwd = 2, col = rainbow(nrow(rnet_cycleway_intersection_clean)))

Created on 2019-10-02 by the reprex package (v0.3.0)

Happy to contribute these and more examples to the project, think a self-standing spatial network package is in order!

luukvdmeer commented 4 years ago

I think we have come far enough now to close this issue ;) Specific feature requests (for example regarding cleaning of networks) could become issues of their own.