Closed khaeru closed 3 years ago
As part of a Google Drive spreadsheet shared by @hlinero / co-developed with @soniayeh, the following checks were defined:
ID | Dividend | Divisor | Resulting variable |
---|---|---|---|
A001 | T000 Passenger Activity | 10^9 passenger-km / yr | Passenger | Road | LDV | T000 Passenger Activity | 10^9 passenger-km / yr | Passenger | ALL | ALL | iTEM | Passenger Activity | % in total inland passenger-km / yr | Passenger | Road | LDV |
A002 | T000 Passenger Activity | 10^9 passenger-km / yr | Passenger | Road | LDV | T008 Stock | 10^6 vehicle | Passenger | Road | LDV | iTEM Passenger vehicle Activity | 10^3 passenger-km/vehicle | Passenger | Road | LDV |
A003 | T003 Freight Activity | 10^9 tonne-km / yr | Freight | Road | All | T010 Stock | 10^6 vehicle | Freight | Road | All | iTEM Freight vehicle Activity |10^3 tonne-km / vehicle | Freight | Road | All |
Draft IPython notebooks were also shared. I will open a new PR to introduce A003 into the code.
As suggested by Pierpaolo Cazzola on the 2019-11-15 call, we need diagnostic calculations that run every time the historical database is regenerated, giving statistics that are helpful in diagnosing data quality issues: