Closed Robinlovelace closed 3 years ago
In develop
this is now available. The argument is called length_as_weight
. It will store the length of the edges in a column named weight
. In tidygraph
and igraph
a column with this name is automatically recognized as being the edge weights to be used.
library(sfnetworks)
as_sfnetwork(roxel, length_as_weight = TRUE)
#> Linking to GEOS 3.7.1, GDAL 2.2.2, PROJ 4.9.2
#> # An sfnetwork with 701 nodes and 851 edges
#> #
#> # CRS: EPSG:4326
#> #
#> # A directed multigraph with 14 components with spatially explicit edges
#> #
#> # Node Data: 701 x 1 (active)
#> # Geometry type: POINT
#> # Dimension: XY
#> # Bounding box: xmin: 7.522622 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
#> geometry
#> <POINT [°]>
#> 1 (7.533722 51.95556)
#> 2 (7.533461 51.95576)
#> 3 (7.532442 51.95422)
#> 4 (7.53209 51.95328)
#> 5 (7.532709 51.95209)
#> 6 (7.532869 51.95257)
#> # … with 695 more rows
#> #
#> # Edge Data: 851 x 6
#> # Geometry type: LINESTRING
#> # Dimension: XY
#> # Bounding box: xmin: 7.522594 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
#> from to name type geometry weight
#> <int> <int> <fct> <fct> <LINESTRING [°]> [m]
#> 1 1 2 Havixbecke… residen… (7.533722 51.95556, 7.533461 51.955… 28.859…
#> 2 3 4 Pienersall… seconda… (7.532442 51.95422, 7.53236 51.9537… 107.704…
#> 3 5 6 Schulte-Be… residen… (7.532709 51.95209, 7.532823 51.952… 54.362…
#> # … with 848 more rows
Created on 2020-10-29 by the reprex package (v0.3.0)
Great work @luukvdmeer will test asap and likely close :+1:
Pre-latest version I get:
> as_sfnetwork(roxel, length_as_weight = TRUE)
# An sfnetwork with 701 nodes and 851 edges
#
# CRS: EPSG:4326
#
# A directed multigraph with 14 components with spatially explicit edges
#
# Node Data: 701 x 1 (active)
# Geometry type: POINT
# Dimension: XY
# Bounding box: xmin: 7.522622 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
geometry
<POINT [°]>
1 (7.533722 51.95556)
2 (7.533461 51.95576)
3 (7.532442 51.95422)
4 (7.53209 51.95328)
5 (7.532709 51.95209)
6 (7.532869 51.95257)
# … with 695 more rows
#
# Edge Data: 851 x 5
# Geometry type: LINESTRING
# Dimension: XY
# Bounding box: xmin: 7.522594 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
from to name type geometry
<int> <int> <fct> <fct> <LINESTRING [°]>
1 1 2 Havixbecker Strasse residential (7.533722 51.95556, 7.533461 51.95576)
2 3 4 Pienersallee secondary (7.532442 51.95422, 7.53236 51.95377, 7.53209 51.95328)
3 5 6 Schulte-Bernd-Strasse residential (7.532709 51.95209, 7.532823 51.95239, 7.532869 51.95257)
Post latest version I get this so the issue is fixed for sure. Great work!
# Aim: test length_as_weight argument
remotes::install_github("luukvdmeer/sfnetworks", "develop")
#> Using github PAT from envvar GITHUB_PAT
#> Skipping install of 'sfnetworks' from a github remote, the SHA1 (7590fd8c) has not changed since last install.
#> Use `force = TRUE` to force installation
library(sfnetworks)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
as_sfnetwork(roxel, length_as_weight = TRUE)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0
#> # An sfnetwork with 701 nodes and 851 edges
#> #
#> # CRS: EPSG:4326
#> #
#> # A directed multigraph with 14 components with spatially explicit edges
#> #
#> # Node Data: 701 x 1 (active)
#> # Geometry type: POINT
#> # Dimension: XY
#> # Bounding box: xmin: 7.522622 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
#> geometry
#> <POINT [°]>
#> 1 (7.533722 51.95556)
#> 2 (7.533461 51.95576)
#> 3 (7.532442 51.95422)
#> 4 (7.53209 51.95328)
#> 5 (7.532709 51.95209)
#> 6 (7.532869 51.95257)
#> # … with 695 more rows
#> #
#> # Edge Data: 851 x 6
#> # Geometry type: LINESTRING
#> # Dimension: XY
#> # Bounding box: xmin: 7.522594 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
#> from to name type geometry weight
#> <int> <int> <fct> <fct> <LINESTRING [°]> [m]
#> 1 1 2 Havixbecke… residen… (7.533722 51.95556, 7.533461 51.955… 28.859…
#> 2 3 4 Pienersall… seconda… (7.532442 51.95422, 7.53236 51.9537… 107.704…
#> 3 5 6 Schulte-Be… residen… (7.532709 51.95209, 7.532823 51.952… 54.362…
#> # … with 848 more rows
# bigger example
network = osmextract::oe_get("isle of wight")
nrow(network)
#> [1] 44424
system.time({network_sfn = as_sfnetwork(network)})
#> user system elapsed
#> 2.803 0.032 2.865
system.time({network_sfnw = as_sfnetwork(network, length_as_weight = TRUE)})
#> user system elapsed
#> 3.760 0.032 3.794
head(network_sfnw %>% sfnetworks::activate("edges") %>% pull(weight))
#> Units: [m]
#> [1] 50.93481 637.29974 599.75302 383.01448 371.25535 375.71839
summary(network_sfnw %>% sfnetworks::activate("edges") %>% pull(weight))
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.7 33.7 79.0 237.9 185.7 1022009.4
plot(network_sfnw[1:999, ])
#> Error in (function (cl, name, valueClass) : assignment of an object of class "units" is not valid for @'x' in an object of class "dgTMatrix"; is(value, "numeric") is not TRUE
Created on 2020-10-30 by the reprex package (v0.3.0)
Observations: strange error message when doing a silly subset - can we at least make the error message clearer? and weight column has units.
Suggestion. Use as.numeric
before creating the weights: those units can cause lots of problems in my experience, e.g. when subsetting based on length.
Observations: strange error message when doing a silly subset - can we at least make the error message clearer? and weight column has units.
I think that the problem here is that we never extended the [
operator to sfnetwork
objects, so network_sfnw[1:999, ]
is just calling igraph:::[.igraph
. For example:
library(sfnetworks)
roxel_sfnetworks <- as_sfnetwork(roxel)
class(roxel_sfnetworks)
#> [1] "sfnetwork" "tbl_graph" "igraph"
roxel_sfnetworks[1:2, ]
#> 2 x 701 sparse Matrix of class "dgCMatrix"
#>
#> [1,] . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [2,] . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . .
#>
#> [1,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [2,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#>
#> [1,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [2,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#>
#> [1,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [2,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#>
#> [1,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [2,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Created on 2020-10-30 by the reprex package (v0.3.0)
The result does make sense since igraph:::[.igraph
says: Query and manipulate a graph as it were an adjacency matrix
. For some reasons that code fails if we add units object but I'm not familiar enough with igraph internals.
Anyway, I think that the best solution here is to keep the units on the weight
field (since that's not the main problem here) and define a new method for [.sfnetwork
(but this is clearly open to discussion since, probably, I'm missing something here 😅 ).
I think that the best solution here is to keep the units on the
weight
field
What are the reasons for doing so? For me it's a matter of +s vs -s.
-s of units m: it can cause unexpected errors when using them, e.g. for filtering the activated objects
+s of units: more faithful to the sf
design and provides additional attribute data
There may be other +s and -s and it's only a mild preference for not adding them based on past experience working with units objects.
I would argue that we rely on tidygraph
for subsetting and all. In the end, sfnetworks
should always be used with tidygraph
and sf
loaded as well, since it is basically a connector between those two packages. The tidygraph
philosophy is to use dplyr
verbs for these kind of operations. Also operators like $
do not work for the same reason.
Why not use the tidy way of plot(slice(net, 1:999))
? Or of course `net %>% slice(1:999) %>% plot()
PS: I reopen the issue since we decided to only close when fixes are on master ;-)
Why not use the tidy way of
plot(slice(net, 1:999))
? Or of course `net %>% slice(1:999) %>% plot()
No reason, was just testing things almost at random and 'kicking the tires' of the objects ti see what happens. Here's another reprex + more observations:
remotes::install_github("luukvdmeer/sfnetworks", "develop")
#> Using github PAT from envvar GITHUB_PAT
#> Skipping install of 'sfnetworks' from a github remote, the SHA1 (7590fd8c) has not changed since last install.
#> Use `force = TRUE` to force installation
library(sfnetworks)
rsfn = as_sfnetwork(roxel)
rsfn2 = as_sfnetwork(roxel, length_as_weight = TRUE)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0
head(rsfn[1:99, ])
#> 6 x 701 sparse Matrix of class "dgCMatrix"
#>
#> [1,] . 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
#> [2,] . . . . . . . . . . . . . . . . . . . . . . . . . . 1 . . . . . . . . . .
#> [6,] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
rsfn2[1:99, ] # inconsistent
#> Error in (function (cl, name, valueClass) : assignment of an object of class "units" is not valid for @'x' in an object of class "dgTMatrix"; is(value, "numeric") is not TRUE
plot(rsfn %>% slice(1:99))
#> Error in slice(., 1:99): could not find function "slice"
plot(rsfn %>% dplyr::slice(1:99))
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
summary(rsfn2 %>% activate("edges") %>% pull(weight))
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 1.956 25.479 45.656 59.369 73.376 513.913
plot(rsfn2 %>% activate("edges") %>% filter(weight > 100))
#> Error: Problem with `filter()` input `..1`.
#> ✖ both operands of the expression should be "units" objects
#> ℹ Input `..1` is `weight > 100`.
plot(rsfn2 %>% activate("edges") %>% filter(as.numeric(weight) > 100))
Created on 2020-10-30 by the reprex package (v0.3.0)
PS: I reopen the issue since we decided to only close when fixes are on master ;-)
Good point, should have remembered that as I recall now advocating that, woops!
Additional thought: the fact that the column is call weight
adds support to coercing it to numeric
I think. People will be more likely to expect a column called length
to be of class units
than the relatively generically named weight
column.
What are the reasons for doing so? For me it's a matter of +s vs -s.
I would preserve the current approach since the units are informative and you can easily drop them but, AFAIK, it's not trivial to recover. Moreover, the problem reported in the previous comment is caused by an erroneous approach, and I think you get the same results with another class that has similar properties (but, unfortunately, I don't understand the internals of igraph
or Matrix
)
Additional thought: the fact that the column is call weight adds support to coercing it to numeric I think. People will be more likely to expect a column called length to be of class units than the relatively generically named weight column.
You have a point there and with the filter example and I agree that it can be annoying 😅
Anyway, I wouldn't drop the units (for the reasons explained above), but it's not a big deal in both scenarios.
was just testing things almost at random and 'kicking the tires' of the objects ti see what happens
Perfect, that is exactly what is needed!
Additional thought: the fact that the column is call weight adds support to coercing it to numeric I think. People will be more likely to expect a column called length to be of class units than the relatively generically named weight column.
That is a good argument.
Remember there is also the spatial edge measure edge_length()
, which can be used inside a mutate
verb and has more flexibility. Could be even an option that users can provide which unit they want (might be more interested in km than m for example), and also turn them off. But adding all such arguments to the construction function is too much I think.
library(sfnetworks)
library(tidygraph)
#>
#> Attaching package: 'tidygraph'
#> The following object is masked from 'package:stats':
#>
#> filter
as_sfnetwork(roxel) %>%
activate("edges") %>%
mutate(length = edge_length())
#> Linking to GEOS 3.7.1, GDAL 2.2.2, PROJ 4.9.2
#> # An sfnetwork with 701 nodes and 851 edges
#> #
#> # CRS: EPSG:4326
#> #
#> # A directed multigraph with 14 components with spatially explicit edges
#> #
#> # Edge Data: 851 x 6 (active)
#> # Geometry type: LINESTRING
#> # Dimension: XY
#> # Bounding box: xmin: 7.522594 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
#> from to name type geometry length
#> <int> <int> <fct> <fct> <LINESTRING [°]> [m]
#> 1 1 2 Havixbeck… reside… (7.533722 51.95556, 7.533461 51.95576) 28.859…
#> 2 3 4 Pienersal… second… (7.532442 51.95422, 7.53236 51.95377,… 107.704…
#> 3 5 6 Schulte-B… reside… (7.532709 51.95209, 7.532823 51.95239… 54.362…
#> 4 7 8 <NA> path (7.540063 51.94468, 7.539696 51.94479… 155.230…
#> 5 9 10 Welsinghe… reside… (7.537673 51.9475, 7.537614 51.94562) 208.663…
#> 6 11 12 <NA> footway (7.543791 51.94733, 7.54369 51.94686,… 63.023…
#> # … with 845 more rows
#> #
#> # Node Data: 701 x 1
#> # Geometry type: POINT
#> # Dimension: XY
#> # Bounding box: xmin: 7.522622 ymin: 51.94151 xmax: 7.546705 ymax: 51.9612
#> geometry
#> <POINT [°]>
#> 1 (7.533722 51.95556)
#> 2 (7.533461 51.95576)
#> 3 (7.532442 51.95422)
#> # … with 698 more rows
Created on 2020-10-30 by the reprex package (v0.3.0)
This is now part of the new version. See here for an example.
As described here: https://github.com/luukvdmeer/sfnetworks/issues/6#issuecomment-648391948