Closed darinchristensen closed 4 years ago
Hmmm, I fear that's going to be a little tricky. I'm sure your error is genuine, and i'd wager that it indeed reflects an internal dodgr
error. The problem is in feeding irregular structures through to weight_streetnet()
- it really has been fairly rigidly coded for all OSM possibilities, but hardly tested for possibilities beyond that. And so my first request back to you would be whether you might be able to contrive a smaller example that prompts this error, and to provide a reprex
of the whole thing, including downloading or otherwise accessing the data set which caused it? (It may even somehow be OSM itself, which also has no guarantee of being free of errors, being user-generated as it is.)
In the meantime, one guess at a workaround might be to first call dodgr_cache_off()
prior to weight_streetnet()
? Even if that is so, some reproducible code will still be necessary to uncover what's going on here. Thanks for the input!
Thanks @mpadge for the quick response!
First, I tried calling dodgr_cache_off()
before weight_streetnet()
but no luck.
Second, despite many attempts, I can't reproduce the error in a sub-sample. You'll see below that the function successfully runs on two exhaustive subsets of the data: one using only roads from OSM; another using those roads from non-OSM sources.
I've included a reprex below. The read_rds
call uses a Dropbox link to the data.
rm(list = ls())
library(tidyverse)
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0
library(dodgr)
library(reprex)
rds_shp <- read_rds(url("<removed>"))
(rds_shp)
#> Simple feature collection with 217853 features and 7 fields
#> geometry type: LINESTRING
#> dimension: XY
#> bbox: xmin: -17.51197 ymin: -4.997724 xmax: 37.73438 ymax: 37.30094
#> geographic CRS: WGS 84
#> First 10 features:
#> fclass length NAME_0 NAME_1 NAME_2 source id
#> 1 trunk 27.8370885 Benin Atlantique Abomey-Calavi osm BEN-1
#> 2 trunk 27.8370885 Benin Atlantique Ouidah osm BEN-2
#> 3 trunk 0.3558411 Benin Littoral Cotonou osm BEN-3
#> 4 trunk 1.4257407 Benin Littoral Cotonou osm BEN-4
#> 5 primary 12.2490001 Benin Plateau Kétou osm BEN-5
#> 6 primary 6.5046034 Benin Plateau Kétou osm BEN-6
#> 7 trunk 0.2505886 Benin Mono Grand-Popo osm BEN-7
#> 8 primary 0.2061700 Benin Mono Comè osm BEN-8
#> 9 trunk 1.3588512 Benin Mono Comè osm BEN-9
#> 10 trunk 0.0452230 Benin Mono Comè osm BEN-10
#> geometry
#> 1 LINESTRING (2.351141 6.3883...
#> 2 LINESTRING (2.351141 6.3883...
#> 3 LINESTRING (2.438702 6.3713...
#> 4 LINESTRING (2.45154 6.37201...
#> 5 LINESTRING (2.610743 7.3667...
#> 6 LINESTRING (2.689136 7.4068...
#> 7 LINESTRING (1.810371 6.2867...
#> 8 LINESTRING (1.882023 6.4033...
#> 9 LINESTRING (1.943153 6.3886...
#> 10 LINESTRING (1.939842 6.3890...
# _______________________________________________________________
# confirm that all road types are reflected in the weighting profile:
t1 <- rds_shp %>% tbl_df() %>% count(fclass)
t2 <- weighting_profiles$weighting_profiles %>% filter(name == "motorcar")
wt_df <- left_join(t1, t2, by = c("fclass" = "way"))
#> Warning: Column `fclass`/`way` joining factor and character vector, coercing
#> into character vector
rm(t1, t2)
(wt_df)
#> # A tibble: 12 x 5
#> fclass n name value max_speed
#> <chr> <int> <chr> <dbl> <dbl>
#> 1 motorway 8531 motorcar 1 90
#> 2 motorway_link 7534 motorcar 1 45
#> 3 primary 59379 motorcar 0.9 65
#> 4 primary_link 9868 motorcar 0.9 30
#> 5 secondary 71569 motorcar 0.8 55
#> 6 secondary_link 7494 motorcar 0.8 25
#> 7 service 3 motorcar 0.4 15
#> 8 tertiary 4681 motorcar 0.7 40
#> 9 tertiary_link 156 motorcar 0.7 20
#> 10 trunk 35163 motorcar 1 85
#> 11 trunk_link 12714 motorcar 1 40
#> 12 unclassified 761 motorcar 0.6 25
# _______________________________________________________________
# running on two exhaustive sub-sets of rds_shp:
dodgr_cache_off()
# restricing to roads from OSM:
(osm_rds <- rds_shp %>% filter(source == "osm") %>% select(id, fclass))
#> Simple feature collection with 215558 features and 2 fields
#> geometry type: LINESTRING
#> dimension: XY
#> bbox: xmin: -17.51197 ymin: -4.997724 xmax: 37.73438 ymax: 37.30094
#> geographic CRS: WGS 84
#> First 10 features:
#> id fclass geometry
#> 1 BEN-1 trunk LINESTRING (2.351141 6.3883...
#> 2 BEN-2 trunk LINESTRING (2.351141 6.3883...
#> 3 BEN-3 trunk LINESTRING (2.438702 6.3713...
#> 4 BEN-4 trunk LINESTRING (2.45154 6.37201...
#> 5 BEN-5 primary LINESTRING (2.610743 7.3667...
#> 6 BEN-6 primary LINESTRING (2.689136 7.4068...
#> 7 BEN-7 trunk LINESTRING (1.810371 6.2867...
#> 8 BEN-8 primary LINESTRING (1.882023 6.4033...
#> 9 BEN-9 trunk LINESTRING (1.943153 6.3886...
#> 10 BEN-10 trunk LINESTRING (1.939842 6.3890...
osm_graph <- weight_streetnet(
x = osm_rds,
type_col = "fclass",
id_col = "id",
wt_profile = "motorcar"
)
tbl_df(osm_graph)
#> # A tibble: 16,388,530 x 15
#> geom_num edge_id from_id from_lon from_lat to_id to_lon to_lat d
#> <dbl> <int> <chr> <dbl> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 1 1 1 2.35 6.39 2 2.35 6.39 96.0
#> 2 1 2 2 2.35 6.39 1 2.35 6.39 96.0
#> 3 1 3 2 2.35 6.39 3 2.35 6.39 21.0
#> 4 1 4 3 2.35 6.39 2 2.35 6.39 21.0
#> 5 1 5 3 2.35 6.39 4 2.35 6.39 32.8
#> 6 1 6 4 2.35 6.39 3 2.35 6.39 32.8
#> 7 1 7 4 2.35 6.39 5 2.35 6.39 36.5
#> 8 1 8 5 2.35 6.39 4 2.35 6.39 36.5
#> 9 1 9 5 2.35 6.39 6 2.35 6.39 17.7
#> 10 1 10 6 2.35 6.39 5 2.35 6.39 17.7
#> # … with 16,388,520 more rows, and 6 more variables: d_weighted <dbl>,
#> # highway <chr>, way_id <chr>, component <int>, time <dbl>,
#> # time_weighted <dbl>
# restricing to roads NOT from OSM:
(nonosm_rds <- rds_shp %>% filter(source != "osm") %>% select(id, fclass))
#> Simple feature collection with 2295 features and 2 fields
#> geometry type: LINESTRING
#> dimension: XY
#> bbox: xmin: 0.2206525 ymin: 7.456352 xmax: 22.9212 ymax: 23.606
#> geographic CRS: WGS 84
#> First 10 features:
#> id fclass geometry
#> 1 NER-3104 primary LINESTRING (7.377661 17.789...
#> 2 NER-3105 primary LINESTRING (7.377661 17.789...
#> 3 NER-3106 primary LINESTRING (7.377661 17.789...
#> 4 NER-3107 primary LINESTRING (7.429513 17.458...
#> 5 NER-3108 primary LINESTRING (7.925001 16.927...
#> 6 NER-3109 primary LINESTRING (7.925001 16.927...
#> 7 NER-3110 primary LINESTRING (7.988597 16.989...
#> 8 NER-3111 primary LINESTRING (6.504773 15.724...
#> 9 NER-3112 primary LINESTRING (6.504773 15.724...
#> 10 NER-3113 primary LINESTRING (6.504773 15.724...
nonosm_graph <- weight_streetnet(
x = nonosm_rds,
type_col = "fclass",
id_col = "id",
wt_profile = "motorcar"
)
tbl_df(nonosm_graph)
#> # A tibble: 1,118,242 x 15
#> geom_num edge_id from_id from_lon from_lat to_id to_lon to_lat d
#> <dbl> <int> <chr> <dbl> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 1 1 1 7.38 17.8 2 7.37 17.9 12955.
#> 2 1 2 2 7.37 17.9 1 7.38 17.8 12955.
#> 3 1 3 2 7.37 17.9 3 7.34 18.1 19173.
#> 4 1 4 3 7.34 18.1 2 7.37 17.9 19173.
#> 5 1 5 3 7.34 18.1 4 7.33 18.2 13379.
#> 6 1 6 4 7.33 18.2 3 7.34 18.1 13379.
#> 7 1 7 4 7.33 18.2 5 7.32 18.3 10446.
#> 8 1 8 5 7.32 18.3 4 7.33 18.2 10446.
#> 9 1 9 5 7.32 18.3 6 7.37 18.3 8204.
#> 10 1 10 6 7.37 18.3 5 7.32 18.3 8204.
#> # … with 1,118,232 more rows, and 6 more variables: d_weighted <dbl>,
#> # highway <chr>, way_id <chr>, component <int>, time <dbl>,
#> # time_weighted <dbl>
# _______________________________________________________________
# runningo on the entire data:
full_graph <- weight_streetnet(
x = rds_shp,
type_col = "fclass",
id_col = "id",
wt_profile = "motorcar"
)
#> Error in `$<-.data.frame`(`*tmp*`, "from_id", value = c("1", "2", "2", : replacement has 17507248 rows, data has 17506772
Created on 2020-04-30 by the reprex package (v0.3.0)
great, thanks for the link - i'll have a look into it asap, and let you know when i'm done so you can delete the link.
Thanks so much for looking into this! Curious if you've turned up any clues or workarounds.
Okay, I can confirm that i can reproduce the error. Now to dig down to the cause ... might take a couple of days. In the meantime, I note that
length (which (duplicated (rds_shp$geometry)))
#> 16479
I'm not sure what happens with duplicated geometries like that, to be honest. OSM data when passed through the osmdata
package have all official OSM ID values attached to all ways as well as all nodes, so identical geometries are ascertained by comparing ID values. In their absence (like in your data), "identical" is simply numerically identical, but I'm not sure what happens with those. That could be part of what's going on here. I'll dig some more and report back.
That commit should fix things - please give a try, and re-open if it does not work. The problem was indeed related to using numeric (coordinate) values to determine unique vertices. This relied on the base::merge()
function but using paste
-ed numeric values to generate character names for coordinate pairs, which can generate unpredictable results. The above push implements a new C++ version of the same, but converting coordinates to strings via a fixed precision. This should be fail-safe.
Unrelated other thoughts: Many roads in the area you're looking at are unpaved. Standard routing using distances will not account for effects of surfaces, but dodgr
also allows time-based routing which does take such things into account. To use that, you have to either re-generate the initial data set via osmdata::osmdata_sc()
-> dodgr::weight_streetnet()
(which will automatically generate appropriate surface-dependent speeds), or look at weighting_profiles$surface_speeds
to get the values and either manually construct time
and time_weighted
columns, or adjust the d_weighted
column to reflect maximal speeds on unpaved surfaces. In case that helps.
It works! Thanks so much for your help with this!
Thanks also for the note about unpaved roads and your suggestions on that front.
Thanks for uncovering a pretty important bug. All of my personal dodgr
use is always via osmdata
, so i don't personally get to test cases like this. Input like yours is particularly helpful to ensure the package works for other cases too.
Background
We've constructed a shapefile of major roads across West and Central Africa. This is based largely on OSM data, but is supplemented with other regional sources (e.g., OCHA). (An image is included at the bottom of this issue.)
I've confirmed that all geometries are LINESTRING.
I've also checked that all fclass (i.e., road type) categories have an entry in the weighting profile for motorcars:
This also shows the distribution of road types.
Error
When I try to use
weight_streetnet
I get the following error:Note that the function actually runs fine on subsets of the data:
Plot of sf object (
rds_shp
)Session Info