fredshone commented 2 years ago

this is a monster (sorry about the merge Andrew):

adds new trip durations bm trip_durations_comparison
renames euclidean_distance_comparison to euclidean_distance_breakdown_comparison
renames duration_comparison to duration_breakdown_comparison
fixes some bugs with the link_vehicle_speeds_comparison
adds scripts/example_configs_smoke_tests.sh to the build (this elara runs all confiogs found in example_configs)
to faciltate the above there is a new option when "running" elara called output_directory_override, but this is only exposed for smoke tests (so that they can write to a temporary directory) so not documented
updates docs

EDIT

removes alll test fixture benchmarks from the benchmarking module and cli interface

Trip durations comparisons (aka google trips queries):

Configured via the config as follows:

[benchmarks]
trip_durations_comparison = {benchmark_data_path = "./PATH.csv"}

The expected format for the new benchmark data (csv) is:

agent,seq,duration_s
chris,1,454.0
chris,2,463.0
nick,1,4.0
nick,2,454.0
other,1,1000

Where agent is the agent id and seq the trip sequence. Duration is in seconds!

I also wanted to support the case where we had colleccted data about trip durations but some agents may have changed modes in sim. If we wanted to only compare agent trips that still use the same mode, we can do as follows:

[benchmarks]
trip_durations_comparison--mode_consistent = {benchmark_data_path = "./PATH.csv", mode_consistent = true}

In which case we should additionally provide a mode column:

agent,seq,mode,duration_s
chris,1,car,454.0
chris,2,car,463.0
nick,1,car,4.0
nick,2,car,454.0
other,1,car,1000

This ensures, for example, that if chris has shifted to bike mode for seq 1, that this comparison is ignored. The default for mode_consistent is false.

I am intending to add standalone docs for the benchmarks in a future PR

andkay commented 2 years ago

Thanks @fredshone -- I'm approving, but have also left a couple thoughts on adding some fail safes for cases where people supply bad values for plot_type.

fredshone commented 2 years ago

I've cleaned up test fixtures.

At this time I am not in favour of being strict about the benchmark kwargs that get passed. There will obviously be room for miss-spelling of keys and bad values, but i consider the benchmark module to be WIP and don't want to engineer anything at this point. One bit of good news is that kwargs that aren't specifically caught will be appended to output file names. So will help with debugging.

arup-group / elara

adds trip duration comparison #166

Trip durations comparisons (aka google trips queries):