cyipt / actdev

ActDev - Active travel provision and potential in planned and proposed development sites
https://actdev.cyipt.bike

Output OD data from ActDev project in JSON schema #29

Closed Robinlovelace closed 3 years ago

Robinlovelace commented 3 years ago

Defined here: https://a-b-street.github.io/docs/dev/formats/scenarios.html#example

Split out from #28.

dabreegster commented 3 years ago

I'm very amenable to updating this format too. For instance, if you're generating demographic data about the people that might describe their routing preferences (quiet cycle routes vs want adrenaline and steep hills), we can make room to specify things like this.

Robinlovelace commented 3 years ago

Great. Thinking out loud, this could also be a candidate for an R wrapper around a Rust crate. Do you have code that can go from something like this (assuming the origin and destination coordinates are correct and accurate to buildings) https://github.com/cyipt/actdev/blob/main/data-small/study_area_trumpington-test-od.csv to the schema? Fun experiment: benchmark an R implementation, e.g. using jsonlite, jsonify and/or (fastest of all) RcppSimdJson, against the Rust implementation, and try wrapping the Rust code with the rextendr package (definitely a stretch goal!).

dabreegster commented 3 years ago

I'm not sure I understand what we would be benchmarking. The OD csv data linked is aggregated, while this JSON format is disaggregated; they represent pretty different things, and lots of arbitrary decisions (like distributing people from each geocode to buildings, and picking departure times) have to be made to convert between the two.

Are we doing this disaggregation in Rust or R? The brunt of the effort isn't code/language choice, but modelling. I don't have much knowledge of how to do it well. We started https://github.com/dabreegster/abstreet/tree/master/popdat/src (described in https://github.com/dabreegster/abstreet/issues/424) that can be partly reused here, if we're going to try in Rust.
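To make those arbitrary decisions concrete, here is a minimal Python sketch of the disaggregation step. It is not the popdat code or anything from actdev: the function name `disaggregate`, the uniform random choice of buildings, and the building coordinates are all assumptions for illustration. The count columns (`foot`, `bicycle`, `car_driver`) are borrowed from the desire-line column names that show up in the error message further down this thread, and their mapping to modes is a guess. Departure times are left out here.

```python
import random

# Assumed mapping from aggregated count columns to travel modes.
COLUMN_TO_MODE = {"foot": "Walk", "bicycle": "Bike", "car_driver": "Drive"}


def disaggregate(od_counts, origin_buildings, dest_buildings, rng):
    """Expand one aggregated OD row into one (home, work, mode) tuple per person.

    od_counts maps a count column to the number of people using that mode.
    Buildings are (longitude, latitude) pairs; each person gets a building
    picked uniformly at random, which is an arbitrary modelling choice.
    """
    people = []
    for column, count in od_counts.items():
        mode = COLUMN_TO_MODE[column]
        for _ in range(count):
            home = rng.choice(origin_buildings)
            work = rng.choice(dest_buildings)
            people.append((home, work, mode))
    return people


# Hypothetical zone pair: 3 buildings in the origin zone, 2 in the destination.
origins = [(-2.6367, 53.4006), (-2.6245, 53.3959), (-2.6286, 53.3971)]
destinations = [(-2.6228, 53.4166), (-2.6317, 53.4171)]
row = {"foot": 2, "bicycle": 1, "car_driver": 3}

people = disaggregate(row, origins, destinations, random.Random(42))
print(len(people))  # one entry per person in the aggregated row
```

A more realistic model would probably weight buildings by floor area or residential use rather than choosing uniformly, which is exactly the modelling effort mentioned above.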

Robinlovelace commented 3 years ago

> I'm not sure I understand what we would be benchmarking.

I meant comparing the csv > json conversion process in R vs Rust as a stretch/fun aim. If there is existing code in Rust, we can use that as a starting point; otherwise I'm happy to make a start on the R side.

> Are we doing this disaggregation in Rust or R?

I'm up for giving it a go in R as a starter for 10; it's a good language for prototyping and modelling. It could eventually be ported to Rust.

dabreegster commented 3 years ago

> I meant comparing the csv > json conversion process in R vs Rust as a stretch/fun aim

Ah, just a parsing benchmark? Rust uses https://serde.rs/ to do this in a well-typed way (validating the schema as it parses). I don't think any of our datasets are large enough to make for an interesting benchmark, though.
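For readers unfamiliar with the idea of validating the schema while parsing, here is a rough Python analogue. It is not serde and not A/B Street's actual types: the dataclass names, the `expect` helper, and `parse_scenario` are hypothetical, and the field names are simply taken from the JSON example posted later in this thread. The point is only that parsing into typed records rejects malformed input at load time rather than somewhere downstream.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Position:
    longitude: float
    latitude: float


@dataclass
class Trip:
    departure: int
    destination: Position
    mode: str


@dataclass
class Person:
    origin: Position
    trips: List[Trip]


@dataclass
class Scenario:
    scenario_name: str
    people: List[Person]


def expect(value, kind, field):
    """Return value unchanged, or raise if it is not of the expected type."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, kind):
        raise TypeError(f"{field}: expected {kind}, got {type(value).__name__}")
    return value


def parse_position(obj, field):
    pos = obj["Position"]  # KeyError if the wrapper key is missing
    return Position(
        float(expect(pos["longitude"], (int, float), field + ".longitude")),
        float(expect(pos["latitude"], (int, float), field + ".latitude")),
    )


def parse_scenario(text):
    """Parse JSON text into a Scenario, failing on the first type mismatch."""
    raw = json.loads(text)
    people = []
    for p in expect(raw["people"], list, "people"):
        trips = [
            Trip(
                expect(t["departure"], int, "departure"),
                parse_position(t["destination"], "destination"),
                expect(t["mode"], str, "mode"),
            )
            for t in expect(p["trips"], list, "trips")
        ]
        people.append(Person(parse_position(p["origin"], "origin"), trips))
    return Scenario(expect(raw["scenario_name"], str, "scenario_name"), people)
```

With serde the equivalent checks come from deriving a deserializer on the struct definitions, so the hand-written `expect` calls above would not be needed.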

> I'm up for giving it a go in R as a starter for 10

Sounds good. I'll focus on improving the map quality / cycleway modelling from the abst side, then.

dabreegster commented 3 years ago

Do we still need to do this in R? https://github.com/dabreegster/abstreet/blob/7797d17ff0ac0b907e1ef5b6dcd59f9c632fa565/importer/src/actdev.rs#L20 is the Rust code that reads in the disaggregated lines, then creates individual people who each take 2 trips (home->work->home) at made-up times (https://github.com/dabreegster/abstreet/blob/7797d17ff0ac0b907e1ef5b6dcd59f9c632fa565/importer/src/actdev.rs#L95).
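For anyone who wants to reproduce the home->work->home pattern outside Rust, a minimal Python sketch follows. It does not mirror actdev.rs line by line: the helper names `position` and `commuter`, the time windows, and the assumption that `departure` is in seconds after midnight are all mine, so check the scenario format docs before relying on the units.

```python
import random


def position(lon, lat):
    """Wrap a coordinate pair the way the JSON example in this thread does."""
    return {"Position": {"longitude": lon, "latitude": lat}}


def commuter(home, work, mode, rng):
    """Build one person who travels home -> work -> home.

    The time windows are invented for this sketch (leave home between 07:00
    and 09:00, leave work 8 to 9 hours later); they are not the values used
    in actdev.rs. Departures are assumed to be seconds after midnight.
    """
    leave_home = rng.randint(7 * 3600, 9 * 3600)
    leave_work = leave_home + rng.randint(8 * 3600, 9 * 3600)
    return {
        "origin": position(*home),
        "trips": [
            {"departure": leave_home, "destination": position(*work), "mode": mode},
            {"departure": leave_work, "destination": position(*home), "mode": mode},
        ],
    }


# Hypothetical home and work coordinates, given as (longitude, latitude).
person = commuter((-2.6367, 53.4006), (-2.6228, 53.4166), "Drive", random.Random(1))
```

Passing in a seeded `random.Random` keeps the made-up times reproducible between runs, which helps when comparing scenarios.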

Robinlovelace commented 3 years ago

No, but it could make it easier for people who know R but not Rust to plug and play their own scenarios. Marking as low priority. Out of interest, how would I run that from the system command line (bash)?

dabreegster commented 3 years ago

It'd be reasonable to do the whole thing in R, especially if you want more control over things like departure time without touching the Rust. To run this:

./import.sh --scenario --city=cambridge

Detailed instructions at https://dabreegster.github.io/abstreet/dev/index.html#building-map-data.

The way the importer splits tasks into --raw, --map, and --scenario is starting to not fit all the different types of importing. I'm thinking about a different way to specify what tasks in the pipeline are run, but for now, that's the command.

dabreegster commented 3 years ago

@Robinlovelace, probably time to make this work for any study area, right? I could make an attempt to generalize https://github.com/cyipt/actdev/blob/main/code/tests/disaggregate.R, but it's going to be a slow learning process, of course.

Robinlovelace commented 3 years ago

> @Robinlovelace, probably time to make this work for any study area, right?

Correct. Will aim to make it work at the person level later in the week. It's unrealistic that all agents from the same house currently go to the same destination, but it's good enough as a starter for 10. Another one to talk over, probably.

dabreegster commented 3 years ago

I was trying in https://github.com/cyipt/actdev/pull/59 to get the script to run at all on my system, so I could parameterize it by the study area. But I think the version of the od package has had an incompatible change; I'm getting an error

Error in select(., car_driver, bicycle, foot, car_commute_godutch, bicycle_commute_godutch, : object 'desire_lines_disag' not found

that I can't figure out.

Robinlovelace commented 3 years ago

I'll take a look... Just about to run our build script for these 5 new regions:

[1] "didcot"           "long-marston"     "taunton-firepool"
[4] "allerton-bywater" "handforth"

Robinlovelace commented 3 years ago

Looking at this:

> I was trying in #59 to get the script to run at all on my system, so I could parameterize it by the study area. But I think the version of the od package has had an incompatible change; I'm getting an error

I think it's nearly fixed...

Robinlovelace commented 3 years ago

Heads-up @dabreegster I'm not sure if this is 100% right but think it's close - comments pls:

```
{
  "scenario_name": ["baseline"],
  "people": [
    {
      "origin": { "Position": { "longitude": -2.6367, "latitude": 53.4006 } },
      "trips": [
        {
          "departure": 30884,
          "destination": { "Position": { "longitude": -2.6228, "latitude": 53.4166 } },
          "mode": "drive"
        }
      ]
    },
    {
      "origin": { "Position": { "longitude": -2.6245, "latitude": 53.3959 } },
      "trips": [
        {
          "departure": 30661,
          "destination": { "Position": { "longitude": -2.6317, "latitude": 53.4171 } },
          "mode": "walk"
        }
      ]
    },
    {
      "origin": { "Position": { "longitude": -2.6286, "latitude": 53.3971 } },
      "trips": [
        {
          "departure": 27822,
          "destination": { "Position": { "longitude": -2.6364, "latitude": 53.4139 } },
          "mode": "drive"
        }
      ]
    }
  ]
}
```

Robinlovelace commented 3 years ago

See here for the data: https://github.com/cyipt/actdev/releases/download/0.1.3/desire_line_out_test_3.json

dabreegster commented 3 years ago

Copying from slack:

One small error, caught by trying to import: the scenario_name is just a string, not ["baseline"]

And mode has to be one of Walk, Bike, Transit, Drive (for "other", we could map to transit, or exclude, or map to Drive if they're a passenger, for now)

(I'm finding these by running the importer, cargo run --bin import_traffic -- --map=data/system/cheshire/maps/chapelford.bin --input=/home/dabreegster/Downloads/desire_line_out_test.json)
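The two fixes above can be sketched as a small post-processing step. This is an illustration in Python rather than the R code that produced the file: `normalize_scenario`, `MODE_MAP`, and the `other_mode` parameter are hypothetical names, and the handling of unrecognised modes just encodes the options listed above (map to one of the four modes, or exclude).

```python
# Accepted mode names, keyed by their lowercase spelling.
MODE_MAP = {"walk": "Walk", "bike": "Bike", "transit": "Transit", "drive": "Drive"}


def normalize_scenario(scenario, other_mode="Transit"):
    """Return a copy of the scenario with the two import errors fixed.

    - scenario_name given as a one-element list is unwrapped to a string.
    - mode values are mapped to Walk, Bike, Transit or Drive; anything else
      becomes other_mode, or the trip is dropped when other_mode is None.
    """
    name = scenario["scenario_name"]
    if isinstance(name, list):
        if len(name) != 1:
            raise ValueError("scenario_name must be a single string")
        name = name[0]

    people = []
    for person in scenario["people"]:
        trips = []
        for trip in person["trips"]:
            mode = MODE_MAP.get(str(trip["mode"]).lower(), other_mode)
            if mode is None:
                continue  # exclude trips whose mode cannot be mapped
            trips.append({**trip, "mode": mode})
        if trips:  # drop people left with no trips at all
            people.append({**person, "trips": trips})

    return {**scenario, "scenario_name": name, "people": people}


# Cut-down version of the JSON posted above, with one "other" trip added.
raw = {
    "scenario_name": ["baseline"],
    "people": [
        {
            "origin": {"Position": {"longitude": -2.6367, "latitude": 53.4006}},
            "trips": [
                {
                    "departure": 30884,
                    "destination": {"Position": {"longitude": -2.6228, "latitude": 53.4166}},
                    "mode": "drive",
                },
                {
                    "departure": 61200,
                    "destination": {"Position": {"longitude": -2.6367, "latitude": 53.4006}},
                    "mode": "other",
                },
            ],
        }
    ],
}

fixed = normalize_scenario(raw)
```

Calling `normalize_scenario(raw, other_mode=None)` drops the "other" trip instead of mapping it, which corresponds to the "exclude" option.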

Robinlovelace commented 3 years ago

Done, with reproducible results in a new prototype package for modularity, reproducibility and scalability: https://cyipt.github.io/abstr/

This could eventually move into the https://github.com/a-b-street/ org