acteng / overline

Aggregation of overlapping lines and values to build route networks
4 stars 0 forks source link

API questions #1

Closed dabreegster closed 1 year ago

dabreegster commented 1 year ago

1) Does the output need to distinguish direction? If two inputs cross the same roads in opposite directions, do we keep those as separate or not?

2) What should happen to the blue bit here? olsketch The black line only has two points, but the red input splits it at a few points. Presumably we want to split the black line at the red points, too. Relatedly, are there any known edge cases with bridges/tunnels that might get grouped together incorrectly? Imagine two lines on top of each other, but with different z-levels. Those should not get grouped, in my mind. Maybe there needs to be an option to consider some attribute on the input as a unique grouping key? Or maybe if we group by OSM way ID or similar, that happens for free anyway.

Robinlovelace commented 1 year ago
  1. Does the output need to distinguish direction? If two inputs cross the same roads in opposite directions, do we keep those as separate or not?

Not for use cases I've seen. In cases where you had 2 different directions, e.g. for morning and afternoon rush hour, you could run overline() twice, resulting in 2 route networks.

Robinlovelace commented 1 year ago

2. Presumably we want to split the black line at the red points, too.

For sure.

Robinlovelace commented 1 year ago

2. Relatedly, are there any known edge cases with bridges/tunnels that might get grouped together incorrectly?

Not that I know of. There would have to be grade separated parallel geometries on top of each other for that to happen, unusual enough (and I think in breach of OSM guidelines) to not worry about is my guess.

Robinlovelace commented 1 year ago

2. Those should not get grouped, in my mind. Maybe there needs to be an option to consider some attribute on the input as a unique grouping key? Or maybe if we group by OSM way ID or similar, that happens for free anyway.

Any examples of that on OSM? Given the use case is deciding where to invest, and a single piece of infrastructure at one level can take people where they want to go, I think the risk of incorrect policy conclusions arising from this is vanishingly small.

Robinlovelace commented 1 year ago

OK progress on another example, generated by ATIP. Imagine 2 people going from ITS to the Chemic Tavern and then 1 person going from the University of Leeds Union to the skate park. If those are the only trips that take place in the universe, how many people travel on different parts of the network?

image

Resulting GeoJSON below, I plan to make this into an example:

``` { "type": "FeatureCollection", "features": [ { "geometry": { "coordinates": [ [ -1.5580339004009853, 53.80777049991803 ], [ -1.5586702003875674, 53.80786559958057 ], [ -1.5586084000750748, 53.8079996002985 ], [ -1.5586539002072408, 53.808007300290086 ], [ -1.558654900060544, 53.80814440007027 ], [ -1.5580040000980424, 53.80930370026401 ], [ -1.5580010005381328, 53.80931080040809 ], [ -1.5580477001917352, 53.80931760017875 ], [ -1.5581299997659, 53.80933250013912 ], [ -1.5581446995758863, 53.80933519990258 ], [ -1.5582120996719317, 53.809349500015436 ], [ -1.558245001047422, 53.809355299740474 ], [ -1.5583453993277494, 53.809372800538924 ], [ -1.5582356000061517, 53.80958480031906 ], [ -1.5581490000039364, 53.809701599712625 ], [ -1.5580518992730812, 53.809808100075166 ], [ -1.5578970006682213, 53.80995459956477 ], [ -1.5576968000866915, 53.81011230010498 ], [ -1.5573934996544865, 53.810287201070224 ], [ -1.5571107015092294, 53.81041219957888 ], [ -1.5568500998047763, 53.81052329997143 ], [ -1.5563710006571354, 53.81071019968554 ], [ -1.5564924003947431, 53.81086800005045 ], [ -1.556455100269774, 53.81088449990395 ], [ -1.5563387996327491, 53.81093810037098 ], [ -1.5562410000634275, 53.810918999679394 ], [ -1.5561480999265558, 53.81091009999278 ], [ -1.5561903993180415, 53.81108600010433 ], [ -1.5562650994020468, 53.8113964000573 ], [ -1.5563168996688694, 53.811604200005526 ], [ -1.5563242994908957, 53.811631201237425 ], [ -1.556137600710553, 53.811785700192715 ], [ -1.5560577001792608, 53.81177230030079 ], [ -1.5559063003046436, 53.81174680003665 ], [ -1.5557954996171053, 53.81208170020813 ], [ -1.5558388003745316, 53.81211789970141 ], [ -1.5557688000548429, 53.812303300246406 ], [ -1.5550988000202381, 53.812143700338964 ], [ -1.5550271993302092, 53.81223789978058 ], [ -1.5548461003334193, 53.812361200370034 ], [ -1.5546102998313267, 53.812356799989416 ], [ -1.5543725994566941, 53.81246249995573 ], [ -1.5544025995937036, 53.81258569982117 ], [ -1.5543693003945847, 53.812735301070994 ], [ -1.5541605004695918, 53.81275010030734 ], [ -1.5541312006836856, 53.81309249992169 ], [ -1.5541128992841156, 53.813388300287706 ], [ -1.5534067003245486, 53.81336610053386 ], [ -1.5533976002981156, 53.8136517997196 ], [ -1.5533805997666854, 53.813817100426704 ], [ -1.55312649937766, 53.813907000211714 ], [ -1.5530054006548892, 53.81389639990795 ], [ -1.553015299353855, 53.81392350006522 ], [ -1.5531734002120328, 53.81401699983462 ], [ -1.5532834022270383, 53.81405660056222 ], [ -1.553458999761704, 53.81413430015033 ], [ -1.5531781014889865, 53.814264200861054 ], [ -1.5530037004504824, 53.81433680039634 ] ], "type": "LineString" }, "properties": { "length_meters": 1099.9489, "waypoints": [ { "lat": 53.80777049991803, "lon": -1.5580339004009853, "snapped": true }, { "lat": 53.81433680039634, "lon": -1.5530037004504824, "snapped": true } ], "intervention_type": "route", "description": "3", "name": "ITS to Chemic" }, "type": "Feature", "id": 1 }, { "geometry": { "coordinates": [ [ -1.5565239993894555, 53.80706729966995 ], [ -1.5568463000596966, 53.807114300015726 ], [ -1.556800001254998, 53.80722100002766 ], [ -1.5567606000767182, 53.80736350023408 ], [ -1.556794300124741, 53.8074251001668 ], [ -1.5567561998146016, 53.80755490015351 ], [ -1.5572165997390433, 53.807629399955346 ], [ -1.557118500667522, 53.80787259989995 ], [ -1.5572389005518266, 53.80789529967661 ], [ -1.557190600527884, 53.808006900091975 ], [ -1.5571772000734008, 53.80803800043175 ], [ -1.5571635999507842, 53.808053600064326 ], [ -1.5571555996117208, 53.8080678012518 ], [ -1.5572157995538731, 53.80811190038607 ], [ -1.557287400243902, 53.80820020117727 ], [ -1.557392900650087, 53.80836419966669 ], [ -1.5575072999019346, 53.80873259956799 ], [ -1.557649000745719, 53.80897059963507 ], [ -1.5576550996996053, 53.80906799976228 ], [ -1.557601001736614, 53.80921290025807 ], [ -1.5574492993345221, 53.80953989988924 ], [ -1.5574335996712327, 53.80957830002249 ], [ -1.557418999695313, 53.80960990038508 ], [ -1.5574763997456886, 53.80961760037667 ], [ -1.5577288997643741, 53.80979379996231 ], [ -1.557787800351023, 53.80978869990948 ], [ -1.5578251004759927, 53.80993620034403 ], [ -1.5578970006682213, 53.80995459956477 ], [ -1.5576968000866915, 53.81011230010498 ], [ -1.5573934996544865, 53.810287201070224 ], [ -1.5571107015092294, 53.81041219957888 ], [ -1.5568500998047763, 53.81052329997143 ], [ -1.5563710006571354, 53.81071019968554 ], [ -1.5565659992787415, 53.81077689967108 ], [ -1.55666219999036, 53.81085450033382 ], [ -1.5573022997220218, 53.81134719997088 ], [ -1.5574701011236693, 53.81128440034314 ], [ -1.5576249997285292, 53.81136910025063 ], [ -1.5585615007533393, 53.81210349976386 ], [ -1.5591110993910022, 53.812526000155 ], [ -1.5596431015181103, 53.81291959994702 ], [ -1.5608353002770117, 53.81385809959939 ], [ -1.5609238999858333, 53.81376650009654 ], [ -1.5609959000121287, 53.81350549987939 ], [ -1.5610221004040583, 53.813503999810955 ] ], "type": "LineString" }, "properties": { "length_meters": 1040.5132, "waypoints": [ { "lat": 53.80706729966995, "lon": -1.5565239993894555, "snapped": true }, { "lat": 53.813503999810955, "lon": -1.5610221004040583, "snapped": true } ], "intervention_type": "route", "description": "1", "name": "Union to skatepark" }, "type": "Feature", "id": 2 } ], "authority": "West Yorkshire Combined Authority", "origin": "atip-v1" } ```
Robinlovelace commented 1 year ago

Heads-up @dabreegster when you get back to a computer, the commit above adds example data and code that generates this simple example that I think answers your question in the original post:

image

Robinlovelace commented 1 year ago

Incidentally, looking at the help, impressive already at this nascent stage:

Usage: cli [OPTIONS] <FILE>

Arguments:
  <FILE>  GeoJSON input with LineStrings

Options:
  -o, --output <output>      Write GeoJSON output here [default: output.geojson]
  -s, --summary <summary>    Print a summary of the input and output, summing the one specified numeric property
      --keep_any <keep_any>  Copy the value of this property from any input feature containing it.
      --sum <sum>            Sum this property as a floating point.
      --min <min>            Minimum of this property as a floating point.
      --max <max>            Maximum of this property as a floating point.
      --mean <mean>          Mean (average) of this property as a floating point.
  -h, --help                 Print help
Robinlovelace commented 1 year ago

Reproducible example just tested showing that the Rust version does not currently handle that scenario well, as far as I can tell the outputs are all zero despite the inputs being 1 to 3:

# Test with Rust version:
> command = "cargo run --manifest-path rust/Cargo.toml test-data/crossing-routes-minimal-leeds.geojson -o test-data/crossing-routes-minimal-leeds-output-rust.geojson --sum description"
> system(command)
    Finished dev [unoptimized + debuginfo] target(s) in 0.06s
     Running `rust/target/debug/cli test-data/crossing-routes-minimal-leeds.geojson -o test-data/crossing-routes-minimal-leeds-output-rust.geojson --sum description`
Reading and deserializing test-data/crossing-routes-minimal-leeds.geojson
... took 580.357µs
Running overline on 2 line-strings
... took 2.715326ms
Aggregating properties on 5 grouped line-strings
... took 168.53µs
Writing to test-data/crossing-routes-minimal-leeds-output-rust.geojson
... took 864.542µs
> rnet_rust = sf::read_sf("test-data/crossing-routes-minimal-leeds-output-rust.geojson")
> tm_shape(rnet_rust) + tm_lines(col = "description", palette = "Blues", scale = 9, breaks = 1:5, as.count = TRUE) + tm_layout(scale = 3)
Warning message:
Values have found that are less than the lowest break 
> summary(rnet_rust)
  description          geometry
 Min.   :0    LINESTRING   :5  
 1st Qu.:0    epsg:4326    :0  
 Median :0    +proj=long...:0  
 Mean   :0                     
 3rd Qu.:0                     
 Max.   :0                     
> # Test with Rust version:
> command = "cargo run --manifest-path rust/Cargo.toml test-data/crossing-routes-minimal-leeds.geojson -o test-data/crossing-routes-minimal-leeds-output-rust.geojson --sum description"
> system(command)
    Finished dev [unoptimized + debuginfo] target(s) in 0.06s
     Running `rust/target/debug/cli test-data/crossing-routes-minimal-leeds.geojson -o test-data/crossing-routes-minimal-leeds-output-rust.geojson --sum description`
Reading and deserializing test-data/crossing-routes-minimal-leeds.geojson
... took 544.015µs
Running overline on 2 line-strings
... took 3.309723ms
Aggregating properties on 5 grouped line-strings
... took 209.01µs
Writing to test-data/crossing-routes-minimal-leeds-output-rust.geojson
... took 1.058685ms
> rnet_rust = sf::read_sf("test-data/crossing-routes-minimal-leeds-output-rust.geojson")
> summary(rnet_rust)
  description          geometry
 Min.   :0    LINESTRING   :5  
 1st Qu.:0    epsg:4326    :0  
 Median :0    +proj=long...:0  
 Mean   :0                     
 3rd Qu.:0                     
 Max.   :0                     
> tm_shape(rnet_rust) + tm_lines(col = "description", palette = "Blues", scale = 9, breaks = 1:5, as.count = TRUE) + tm_layout(scale = 3)
Warning message:
Values have found that are less than the lowest break 

Will commit that test asap.

Robinlovelace commented 1 year ago

Correction: the Rust version seems to do fine in this example. Just realised though that this test is not the same as the test you mentioned because both input datasets have vertices there. I'm not sure how the R version handles the version you describe. Output showing IMO correct Rust output below and fix in commit below. Reason for issue: "3" instead of 3 in input GeoJSON, which also affected the R version.

image

Robinlovelace commented 1 year ago

Reproducible example of scenario sketched above...

image

Robinlovelace commented 1 year ago

Update: the example you specific above seems to fail for the R version as shown in the fully reproducible example below. As I mentioned I suspect there are some edge cases where the R version does not work as it should and this could be one of them, were the red line should be split at the vertices:

# Test example with 2 lines parallel for some of the way:
library(sf)
#> Linking to GEOS 3.11.1, GDAL 3.4.3, PROJ 8.2.1; sf_use_s2() is TRUE
#> WARNING: different compile-time and runtime versions for GEOS found:
#> Linked against: 3.11.1-CAPI-1.17.1 compiled against: 3.10.2-CAPI-1.16.0
#> It is probably a good idea to reinstall sf, and maybe rgeos and rgdal too
library(stplanr)
line1 = sf::st_linestring(matrix(c(0, 1, 0, 0), ncol = 2))
line2 = sf::st_linestring(matrix(c(
    0.2, -0.1,
    0.2, 0,
    0.8, 0,
    0.8, 0.1
    ), ncol = 2, byrow = TRUE))
plot(line1, col = "red", lwd = 9)
plot(line2, col = "blue", lwd = 5, add = TRUE)

lines_without_shared_vertices = sf::st_sfc(line1, line2)
lines_without_shared_vertices_sf = sf::st_sf(geometry = lines_without_shared_vertices)
lines_without_shared_vertices$value = c(1, 2)
lines_without_shared_vertices_overline = overline(sl = lines_without_shared_vertices_sf, attrib = "value")
#> Error in `[.data.frame`(x, i, j, drop = drop): undefined columns selected

Created on 2023-04-09 with reprex v2.0.2

Robinlovelace commented 1 year ago

Another update: the R version does provide an answer, but the wrong answer, with only 2 lines, some of which overlap, instead of 5:

image

dabreegster commented 1 year ago

Thanks for clarifying and generating test inputs! #9 and #10 will cover this.

Reason for issue: "3" instead of 3 in input GeoJSON, which also affected the R version.

3 and "3" are very different in GeoJSON; the caller should be sure to pass in numeric properties. Relatedly, I notice lots of the geojson files that maybe come from the sf ecosystem have "crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } }. Per https://github.com/chrieke/geojson-invalid-geometry#defined-coordinate-reference-system-crs, this is from an older spec. I think I've seen geojson.io or some other tool actually reject this. Upstream problem worth checking into at some point?