joeytalbot closed this issue 6 months ago
For confirmation, in the commute and school data, each row has both a unique route_id and a unique route_number:
> tar_load(r_commute_fastest)
> dim(r_commute_fastest)
[1] 433255 18
> length(unique(r_commute_fastest$route_number))
[1] 433255
> length(unique(r_commute_fastest$route_id))
[1] 433255
> tar_load(r_school_fastest)
> dim(r_school_fastest)
[1] 55975 18
> length(unique(r_school_fastest$route_number))
[1] 55975
> length(unique(r_school_fastest$route_id))
[1] 55975
route_id is used as a grouping variable in a large number of functions. We could either change all of these to use route_number instead, or we could reset route_id so it is unique for every row.
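A rough sketch of the two options, using base R on a made-up table. The `routes` data frame and the `length_m` column are hypothetical and only there for illustration; the real route tables have 18 columns, and the real grouping happens inside the package functions.

```r
# Hypothetical toy table: route_number is unique, route_id has collisions
routes <- data.frame(
  route_number = 1:5,
  route_id = c(1L, 2L, 1L, 2L, 3L),
  length_m = c(100, 250, 300, 120, 80)  # assumed column, for illustration only
)

# Option 1: group on route_number instead of route_id
by_number <- aggregate(length_m ~ route_number, data = routes, FUN = sum)

# Option 2: reset route_id so that it is unique for every row,
# then keep using route_id as the grouping variable
routes$route_id <- seq_len(nrow(routes))
by_id <- aggregate(length_m ~ route_id, data = routes, FUN = sum)
```

Either way each row ends up in its own group. Option 2 leaves the existing function signatures alone, at the cost of overwriting whatever the original route_id values encoded.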
Any preferences? @mem48 @Robinlovelace
This is fixed in #377
In the utility trips, some route_id values are duplicated.
I think this is because the route_id values are generated separately for shopping, visiting and leisure trips, and when these are combined, some happen to be identical.
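A minimal reproduction of the suspected cause, assuming each trip purpose numbers its routes independently from 1. The three data frames below are invented for illustration and are not the real targets.

```r
# Hypothetical toy data: each purpose generates its own route_id sequence
shopping <- data.frame(route_id = 1:3, purpose = "shopping")
visiting <- data.frame(route_id = 1:2, purpose = "visiting")
leisure  <- data.frame(route_id = 1:4, purpose = "leisure")

# Combining the purposes keeps the per-purpose IDs as they are
utility <- rbind(shopping, visiting, leisure)

# IDs that were unique within each purpose now collide across purposes
n_rows  <- nrow(utility)
n_ids   <- length(unique(utility$route_id))
dup_ids <- unique(utility$route_id[duplicated(utility$route_id)])
```

If this is what happens, `n_ids` comes out smaller than `n_rows`, and `dup_ids` lists the route_id values shared by more than one row.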
I'm not sure why we need to create both route_id and route_number, but it might be better to use route_number instead as a grouping variable.