ITHIM / ITHIM-R

Development of the ITHIM-R, also known as ITHIM version 3.0. Started in January 2018.
https://ithim.github.io/ITHIM-R/
GNU General Public License v3.0

Scenarios #27

Open JDWoodcock opened 6 years ago

JDWoodcock commented 6 years ago

To agree approaches to scenarios:
1) trip-, time-, or distance-based, depending on data
2) longer term: based on changing urban form (and consideration of a move to a more geographically explicit model)
3) time horizon: changes to disease burdens, emission factors, injury risks etc. over time; discount rates; trajectories of uptake

gotom22 commented 6 years ago

strikes me as a top priority to clarify/sort out.

I would suggest developing some "spreadsheet/list with some sorting order" in which the various scenario options are described/categorized. This could include columns like:

Could we aim to

  1. as the first steps:

    • clarify in which format to develop this (spreadsheet, lists, charts...)
    • define some terminology (like the column headers/topics I suggest above)
  2. brainstorm ideas (i.e. populate 1.) (just looking at earlier ITHIM use case should provide plenty)

  3. sort out, polish and prioritize

  4. integrate with data reqs, coding reqs, shiny, flow chart etc.

gotom22 commented 6 years ago

@JDWoodcock @nmaizlish

...my plan to hack "scenario parameters" directly into table format was not too successful....it seems a bit tricky to nail the relevant columns right away: https://docs.google.com/spreadsheets/d/1Bzq_6iuQ6N1eD2dLkQrTVb6ffEEZSK6TQ5ZOOQ34ZRQ/edit#gid=847742393

So I opted for figuring this out first in the flow chart (surprise;-) . Not so straight forward either....but it may be helpful at some point... ITHIM-R_v4_scenario definition.pdf

Basically I am trying to get an overview of which "parameters" users would define in their "scenarios", and which variables would be affected downstream. The draft is in ITHIM-R_v4.graphml (folder flowcharts). It is certainly not complete, and not sure my structure for the main domains "trips", "traffic/fleet/(all trips)", "individuals", "environment", "population" makes the most sense....but we'll see.

What would be helpful are more "scenario suggestions" or themes... and also very specific examples.

gotom22 commented 6 years ago

from Neil

I have a hard time going beyond the “what” of scenario development to the “why”, which I have regarded as external to the ITHIM model and part of some other model, which would export its outputs as ITHIM inputs. Under a reasonable number of “what ifs”, we can specify arbitrary changes in walking, cycling, and transit use, and car-mile substitution. We can also potentially build in substituting for a specific percentage of car trips < 1, 3, 5 miles (or kilometers), if users can provide data on the distances of car trips of <1, 3, 5 mile length (which are derived from household travel surveys). I leave it up to James’ team to comment on propensity to cycle if the number/percentage of destinations were increased within radii. Also, do we want to include “equity scenarios”?

Once we get into changes in land use, creation of infrastructure, or the impact of specific (big) projects, some kind of microsimulation would be required. The only exception could be a type of SRS approach (4th IPCC), where specific futures are described with a set of land-use, transport electrification, infrastructure changes, levels of investment, etc., and rough estimates are made of what the changes in travel behavior would be. Perhaps hybridizing the best of different countries' experience might inform this approach. Still, this sounds like a huge task, for which lots of questions would be raised. Neil.

gotom22 commented 6 years ago

From James

I put a few initial thoughts up there on what could be on the interface following Neil’s earlier ppt. I put it in the same IO interface folder.

For now I stick with ppt.

ITHIM R.pptx (for editing use version in Gdrive Folder ITHIM-R/I-O Interface/ITHIM R.pptx)

gotom22 commented 6 years ago

@JDWoodcock some feedback on the ppt, and suggestions on how to make it more actionable.

...ok, that's a lot obviously;-) can grow organically as you see worthwhile.

gotom22 commented 6 years ago

I have tried to lay out and identify the "most likely modifiable scenario parameters" (yellow) and how they relate to other variables and impacts.... I think we will eventually have to distinguish "scenario parameters" (very content focused) from "calculation parameters/assumptions" (various other ways to tweak the math)

ITHIM-R_v5_scenario definition.pdf (you may have to go to flow charts folder for full res version, not sure. This is exported from ITHIM-R_v5.graphml)

(work in progress. a few details still to adopt from James' slides)

gotom22 commented 6 years ago

Tasks I see on the "flow chart view" for scenarios:

gotom22 commented 6 years ago

Task for the "ppt interface visualization"

JDWoodcock commented 6 years ago

@gotom22 Task for the "ppt interface visualization" draft a "Scenario definition page(s)" that reflects the flow chart elements. - What are you thinking of here?

gotom22 commented 6 years ago

I think each yellow box in the pdf needs some representation in the interface (number field, slider or similar)


Thomas Götschi

On 13.03.2018 at 14:43, James Woodcock notifications@github.com wrote:

@gotom22 Task for the "ppt interface visualization" draft a "Scenario definition page(s)" that reflects the flow chart elements. - What are you thinking of here?


markotainio commented 6 years ago

At this phase I would focus on mode shift scenarios, where the mode shift is based either on user-defined changes or on propensity-generated changes. Land use scenarios would be done later on.

For scenario tool we have two examples from past:

  1. For user-input scenarios: the so-called trip-based version of ITHIM, developed in Analytica, where the user could define what % of trips from different modes are transferred to cycling. See the illustration below. [image: tripbasedithim]

In this example, 50% of all car/taxi and motorcycle trips 0-5 km long are transferred to cycling, and 50% of all public transport (bus, underground, national rail) trips 0-3 km long are transferred to walking.

This isn't a very good example of a user interface, but it shows one way scenarios could be done.

  2. An example of propensity-driven scenarios is in ICT (http://www.pct.bike/ict/). This could be replicated in ITHIM-R just as it is in the ICT tool.

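
The trip-based example in item 1 (50% of short car/taxi/motorcycle trips to cycling, 50% of short public transport trips to walking) can be sketched roughly as below. This is a minimal Python illustration, not the Analytica model or ITHIM-R code: the function `shift_trips`, the record fields `mode` and `distance_km`, and the toy baseline are all hypothetical, and random selection of which eligible trips to move is an assumption.

```python
import random

def shift_trips(trips, from_modes, to_mode, max_km, fraction, seed=0):
    """Return a copy of `trips` in which `fraction` of the eligible trips
    (mode in `from_modes`, distance <= `max_km`) are reassigned to `to_mode`."""
    rng = random.Random(seed)  # fixed seed so the scenario is reproducible
    eligible = [i for i, t in enumerate(trips)
                if t["mode"] in from_modes and t["distance_km"] <= max_km]
    chosen = set(rng.sample(eligible, round(len(eligible) * fraction)))
    return [dict(t, mode=to_mode) if i in chosen else dict(t)
            for i, t in enumerate(trips)]

# Hypothetical baseline trips (illustrative values only).
baseline = [
    {"mode": "car", "distance_km": 1.0},
    {"mode": "car", "distance_km": 4.5},
    {"mode": "taxi", "distance_km": 3.0},
    {"mode": "motorcycle", "distance_km": 2.0},
    {"mode": "car", "distance_km": 12.0},        # too long to be shifted
    {"mode": "bus", "distance_km": 2.0},
    {"mode": "underground", "distance_km": 1.5},
    {"mode": "rail", "distance_km": 8.0},        # too long to be shifted
    {"mode": "bus", "distance_km": 6.0},         # too long to be shifted
]

# 50% of car/taxi/motorcycle trips of 0-5 km go to cycling ...
scenario = shift_trips(baseline, {"car", "taxi", "motorcycle"}, "bicycle", 5, 0.5)
# ... and 50% of public transport trips of 0-3 km go to walking.
scenario = shift_trips(scenario, {"bus", "underground", "rail"}, "walk", 3, 0.5)
```

Keeping the rule as four arguments (source modes, target mode, distance cap, fraction) maps directly onto the kind of number fields or sliders discussed earlier for the interface.
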
nmaizlish commented 6 years ago

James, thanks so much for working out the wire frame in more detail on the PowerPoint slides. We should discuss the I/O organizing principle. Are you suggesting "new" vs "established" implementation? I suggest an alternative "scenario-place-time" approach. Even though the content is equivalent, ease of navigation and user understanding may be different. Also, from the standpoint of design, the data processing steps for travel surveys, etc. could be part of the direct navigation pathway (which you are suggesting?) or performed from a separate "utilities" tab.

There are some important nuances of scenario creation that I think Thomas picked up on regarding 1) imported, pre-processed data defining a scenario in terms of distance/time by mode, and 2) calculations internal to ITHIM-R performed on calibration or BAU data triggered by user selectable options (no additional data beyond that of calibration data).

Looking back over our work with the aggregate ITHIM for the last 8 years, I have catalogued the ways we have contextualized and created scenarios (more than I realized). They fall into one of the two functional categories, although the contexts or use cases may be different. I put together a spreadsheet posted on the GoogleDrive that describes these. A talkier version of this could be incorporated into one of the interface web pages to give ideas and guidance to users.

JDWoodcock commented 6 years ago

Thanks @nmaizlish, very helpful. I've incorporated a few ideas from the spreadsheet into the ppt; please add to that. I don't understand your point: "Are you suggesting "new" vs "established" implementation? I suggest an alternative "scenario-place-time" approach."

gotom22 commented 6 years ago

UPDATE: in main project structure we now distinguish 2 ways to get to "Scenarios":

Scenario Definition (Creation of "Synthetic counterfactual data") (2 options:)

1 Manipulation of baseline data through a Scenario definition module (interface or code).
2 Creation of Synthetic scenario data, based on local data sources, analogous to steps for baseline data above.

In particular for approach 1 we should take some time to synthesize above discussion and inputs. I will add to #16 , but likely will warrant separate meeting/call.

nmaizlish commented 6 years ago

Hi, this seems like the logical spot to insert the Salon et al approach to constructing scenarios by examining potential shifts in the distribution of place types (urban form) and their impact on active travel. Attached is the paper Debra Salon wrote (Debra Salon JTRG 2016.pdf) describing how to do this using widely available US data sources. This may factor into the small area estimation that Sam is contemplating for a US County-based version of ITHIM.

JDWoodcock commented 5 years ago

@robj411 @usr110 @markotainio @rahulatiitd @leandromtg for the TIGTHAT case cities model I make the following suggestion, after Rob and I had a quick discussion about scenarios yesterday.

Suggestion 1) We use a propensity approach such that for each city we increase one mode (and decrease the rest), giving the following scenarios:

1) Highest propensity to walk from our set of cities
2) Highest propensity to cycle from our set of cities
3) Highest propensity to drive/taxi from our set of cities
4) Highest propensity to motorbike from our set of cities
5) Highest propensity to bus (train) from our set of cities

We would assume trips are taken from each other mode equally for a given distance (as we do in PCT)

I’d like people to foresee problems. I will start:

1) As we add more cities our scenarios will change (potentially a lot if we added e.g. a high-cycling city), unless we could generate propensities before we have done the rest of the city work
2) Motorbike rates vary a lot and injuries might go sky high (potentially unrealistically so)
3) How do we vary propensities by age/gender? Probably large age groups. And should we use different cities if men’s propensity is higher in one place and women’s in another?
…

What I see we want from the scenarios is something we can compare across settings but clearly this could mean a lot of things e.g. same absolute levels, same propensity, same % increase, same absolute increase. The scenarios need to fit with VoI. They also need to be explainable

It would be nice to do a BAU scenario but practically I don’t see we have time to do this and it is so assumption laden. It also throws up the problem of what else will change. In Ghana we try and deal with this via the sensitivity analysis and I think that is the best option- other views welcome.

Suggestion 2) My second-preference alternative would be that we see a relative increase in each mode for each city.

Pros:
1) Every city can change in each scenario
2) Might be closer to what could happen sooner for some cities

Cons:
1) I think it will be difficult to choose a relative increase (even an odds) that makes sense and is consistent for each city

I don't see we can do urban form changes (see above) but we will revisit for future work. Neil's @nmaizlish https://drive.google.com/file/d/15k9HmS_hylT4xV0aoVqACleZMjX54sRV/view is worth reading

robj411 commented 5 years ago

Here is what Suggestion 1 might look like, with the three cities we have so far, five modes, and an arbitrary choice of distance categories, where we choose the highest based on total trip share.

Total trip share as %, and we choose the highest for each mode:

mode         accra  sao_paulo  delhi
walking       53.3       35.3   44.4
bicycle        0.5        0.9    4.0
car            9.6       26.1    9.6
motorcycle     0.0        2.7   15.7
bus           30.6       21.3   15.6
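
Choosing "the highest for each mode" from the table above can be sketched as follows. This is a minimal Python illustration rather than the package's R code; the variable names `trip_share` and `reference_city` are made up for the example, and the figures are copied from the table.

```python
# Total trip share (%) by city and mode, copied from the table above.
trip_share = {
    "accra":     {"walking": 53.3, "bicycle": 0.5, "car": 9.6,  "motorcycle": 0.0,  "bus": 30.6},
    "sao_paulo": {"walking": 35.3, "bicycle": 0.9, "car": 26.1, "motorcycle": 2.7,  "bus": 21.3},
    "delhi":     {"walking": 44.4, "bicycle": 4.0, "car": 9.6,  "motorcycle": 15.7, "bus": 15.6},
}

modes = list(next(iter(trip_share.values())))

# For each mode, the city whose total share of that mode is highest.
# That city's distance-specific shares then define the mode's scenario.
reference_city = {
    mode: max(trip_share, key=lambda city: trip_share[city][mode])
    for mode in modes
}

for mode, city in reference_city.items():
    print(f"{mode}: {city} ({trip_share[city][mode]}%)")
```
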

& here are those percentages again by distance category. If a city has a mode in bold, it's unchanged in that mode's scenario.

accra        0-1 km  2-5 km  6+ km
walking        93.3    66.4    2.0
bicycle         0.6     0.3    0.4
car             1.0    10.4   18.2
motorcycle      0.0     0.0    0.0
bus             0.0    18.5   71.2

sao_paulo    0-1 km  2-5 km  6+ km
walking        85.0    33.6    0.0
bicycle         0.3     2.0    0.9
car             6.2    39.5   35.7
motorcycle      0.3     3.9    4.0
bus             0.0    19.7   37.4

delhi        0-1 km  2-5 km  6+ km
walking        83.0    29.2    2.0
bicycle         2.1     6.1    5.2
car             2.1    11.0   18.9
motorcycle      5.8    24.0   23.5
bus             1.3    17.0   34.2

JDWoodcock commented 5 years ago

my impression is that this will be interesting and worthwhile - @markotainio could you ask Audrey about problems they encountered with similar scenarios in the past? @robj411 do you think this will work for VoI. @markotainio I agree testing some small consistent (probably relative) increase too would be worthwhile but we should decide which to prioritise.

markotainio commented 5 years ago

my impression is that this will be interesting and worthwhile - @markotainio could you ask Audrey about problems they encountered with similar scenarios in the past?

I checked the reviewer comments, and it seems that I misremembered the situation. There was nothing specific about scenarios, except a request to describe them in more detail.

robj411 commented 5 years ago

If we use the maximum share for each mode and each distance, we would have:

mode         0-1 km  2-5 km  6+ km
walking        93.3    66.4    2.0
bicycle         2.0     5.6    5.0
car             6.2    39.5   35.7
motorcycle      5.6    22.5   22.3
bus             1.2    21.1   71.2

JDWoodcock commented 5 years ago

some of the values are high but i think this makes sense. If we can get Bangalore in there soon we can see how it looks. @rahulatiitd @leandromtg Is it possible to look at the other cities to include them for scenarios before we implement them fully?

rahulatiitd commented 5 years ago

Definitely can do Bogota rather quickly.. @JDWoodcock @leandromtg See, if it is only travel surveys, then I can do others also.. but then we should only do those which we know for sure will become full case studies-- Bogota is one, then maybe Buenos Aires..

robj411 commented 5 years ago

multi_city_yll_pp.pdf

What it might look like: some results for the city & distance category maximum propensities. Omitting all cause and injury as they are off the scale. NB lots of parameters are still to be set for Sao Paulo and Delhi so these are in no way final.

JDWoodcock commented 5 years ago

@rahulatiitd yes agree we need final list of cities. Next monday that means we want a presentation on the data we have to make that decision- please help @markotainio get this ready.

JDWoodcock commented 5 years ago

@robj411 i would suggest looking at all cause vs injury too

robj411 commented 5 years ago

multi_city_yll_pp.pdf

What it might look like: some results for the city & distance category maximum propensities. NB lots of parameters are still to be set for Sao Paulo, Delhi and Bangalore so these are in no way final.

rahulatiitd commented 5 years ago

I dont understand these results.. why diseases broken down by modes. Bangalore is jumping out. I am glad this process has started though.

rahulatiitd commented 5 years ago

@robj411 Could you please tell me what you are using as emission inventory. The car scenario giving such high reduction in the emissions for 'bangalore' (the city that is certainly an outlier in IRI) is only possible if you are using Sao Paulo emissions which have awkwardly low emissions factors for cars (ethanol) and makes car scenario very pollution pleasing. I really hope thats the case, otherwise, it just means more investigation.

robj411 commented 5 years ago

The modes correspond to "city & distance category maximum propensities" scenarios. I wouldn't worry about the numbers until the input data have been finalised. Any we can complete/tick off? #49 #52 #53 #55

robj411 commented 5 years ago

@rahulatiitd

delhi=list(motorcycle=1409,
           auto_rickshaw=133,
           car=2214,
           bus_driver=644,
           big_truck=4624,
           truck=3337,
           van=0,
           other=0,
           taxi=0),

bangalore=list(motorcycle=1757,
               auto_rickshaw=220,
               car=4173,
               bus_driver=1255,
               big_truck=4455,
               truck=703,
               van=0,
               other=0,
               taxi=0)

Sao Paulo and Accra are

bus=0,
bus_driver=0.82,
car=0.228,
taxi=0.011,
walking=0,
bicycle=0,
motorcycle=0.011,
truck=0.859,
big_truck=0.711,
other=0.082

rahulatiitd commented 5 years ago

Alright, so Bangalore and Delhi are clearly 'transport emissions' with a unit of mass (tonnes/kg) and Sao Paulo and Accra are 'emission factors' with a unit of g/km. By that logic though, the outlying results should have happened for Delhi too. To harmonise the approaches across all cities, convert the Delhi and Bangalore emissions into emission factors by dividing each of those numbers by their respective baseline distances.

rahulatiitd commented 5 years ago

sorry i meant 'tonnes or kg' not tonnes per kg for unit of mass; the latter is still g per km

robj411 commented 5 years ago

Alright, so Bangalore and Delhi are clearly 'transport emissions' with a unit of mass (tonnes/kg) and Sao Paulo and Accra are 'emission factors' with a unit of g/km. By that logic though, the outlying results should have happened for Delhi too. To harmonise the approaches across all cities, convert the Delhi and Bangalore emissions into emission factors by dividing each of those numbers by their respective baseline distances.

The emissions calculation is 'harmonised' as the emissions are relative to one another. This means that the magnitude is not important. Think of it as a scaled inventory. We would get the same results for Sao Paulo and Accra were we to write

Accra:

bus=0, bus_driver=820, car=228, taxi=11, walking=0, bicycle=0, motorcycle=11, truck=859, big_truck=711, other=82

Sao Paulo:

bus=0, bus_driver=82000, car=22800, taxi=1100, walking=0, bicycle=0, motorcycle=1100, truck=85900, big_truck=71100, other=8200
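
The scale-invariance claim can be checked with a small sketch. It is written in Python rather than the package's R, and it assumes that "relative to one another" means only each mode's share of the inventory total is used downstream; that reading is an assumption, and the helper name `shares` is made up for the example. The numbers are the default values and the two rescaled versions listed above.

```python
import math

def shares(inventory):
    """Each mode's fraction of the inventory total."""
    total = sum(inventory.values())
    return {mode: value / total for mode, value in inventory.items()}

# Default values quoted above for Sao Paulo and Accra.
default = {
    "bus": 0, "bus_driver": 0.82, "car": 0.228, "taxi": 0.011,
    "walking": 0, "bicycle": 0, "motorcycle": 0.011,
    "truck": 0.859, "big_truck": 0.711, "other": 0.082,
}

# The two rescaled versions written out above: x1000 and x100000.
accra = {mode: value * 1000 for mode, value in default.items()}
sao_paulo = {mode: value * 100000 for mode, value in default.items()}

base = shares(default)
same_shares = all(
    math.isclose(base[mode], shares(scaled)[mode], abs_tol=1e-12)
    for scaled in (accra, sao_paulo)
    for mode in default
)
print(same_shares)
```

Under that assumption the unit of the inventory (tonnes, kg, or a scaled index) drops out, although inventories for different cities can still differ in their mode split.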

rahulatiitd commented 5 years ago

@markotainio could you please confirm if the Sao Paulo numbers that Rob has listed up are emissions or emission inventory!

rahulatiitd commented 5 years ago

i meant emission factors not inventory

robj411 commented 5 years ago

They are the (scaled) emission inventory of Accra and the default values in ithimr.

rahulatiitd commented 5 years ago

@robj411 we can use emission inventory for Accra as estimated by us during the Accra model. I wonder why it isnt there given that @leandromtg may be using those results

rahulatiitd commented 5 years ago

in your VOI paper, you have used emission inventory for Sao Paulo-- i am copying from the table in your manuscript. Can we explore where this data came from

  Background PM2.5 concentration: Lognormal(3,1)
  Fraction of PM2.5 attributed to road transport: Beta(2,3)
  Fraction of #emissions# attributed to four different modes (bus, car, motorcycle, goods vehicles): Dir(32,8,4,56)

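
For readers unfamiliar with the notation, one draw from each of the three distributions above can be sketched as below, in Python rather than R. Two things are assumptions: that Lognormal(3,1) is parameterised by the mean and standard deviation on the log scale, and that the Dirichlet parameters follow the order in which the modes are listed. The helper `dirichlet` is written out here because the Python standard library has no Dirichlet sampler. The expected mode fractions of a Dir(32,8,4,56) are 0.32, 0.08, 0.04 and 0.56, since the parameters sum to 100.

```python
import random

rng = random.Random(1)  # fixed seed so the draw is reproducible

def dirichlet(alphas, rng):
    """One Dirichlet draw, built from normalised Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(draws)
    return [d / total for d in draws]

# Assumed parameterisation: log-scale mean 3, log-scale sd 1.
background_pm25 = rng.lognormvariate(3, 1)

# Fraction attributed to road transport, between 0 and 1.
road_fraction = rng.betavariate(2, 3)

# Split across the four modes, in the order given in the table.
modes = ["bus", "car", "motorcycle", "goods vehicles"]
mode_fractions = dict(zip(modes, dirichlet([32, 8, 4, 56], rng)))

print(background_pm25, road_fraction, mode_fractions)
```
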
JDWoodcock commented 5 years ago

agreed very useful. I propose we agree on monday a set of results (and also table of inputs) that @rahulatiitd @markotainio and I can check over and use to spot errors. For inputs we would want mode share, mean time per mode, simple who hit whom table, pollution levels, source apportionment, what else?

robj411 commented 5 years ago

@rahulatiitd

accra        0-1 km  2-5 km  6+ km
walking        96.4    66.4    2.0
bicycle         0.6     0.3    0.4
car             1.0    10.4   18.2
motorcycle      0.0     0.0    0.0
bus             0.0    18.5   71.2

sao_paulo    0-1 km  2-5 km  6+ km
walking        85.0    33.6    0.0
bicycle         0.3     2.0    0.9
car             6.2    39.5   35.7
motorcycle      0.3     3.9    4.0
bus             0.0    19.7   37.4

delhi        0-1 km  2-5 km  6+ km
walking        83.5    27.8    1.9
bicycle         2.0     5.6    5.0
car             2.0    10.3   18.1
motorcycle      5.6    22.5   22.3
bus             1.2    21.1   35.9

bangalore    0-1 km  2-5 km  6+ km
walking        85.3    24.3    0.6
bicycle         2.7     9.7    0.8
car             1.1     4.1    5.4
motorcycle      5.3    34.7   56.6
bus             4.0    22.8   34.0

rahulatiitd commented 5 years ago

@robj411 this is so good, so helpful and exactly the kind of diagnostic tables we need! Thank you so much

JDWoodcock commented 5 years ago

Thanks. For Accra, should we be including the added motorcycle trips?

rahulatiitd commented 5 years ago

@robj411 for each city

rahulatiitd commented 5 years ago

@robj411 should each of these columns not sum to 100% for each city? They do not for now

robj411 commented 5 years ago

Thanks. For Accra, should we be including the added motorcycle trips?

That would seem like a strange thing to do as they weren't recorded in the travel survey

robj411 commented 5 years ago

@robj411 should each of these columns not sum to 100% for each city? They do not for now

There are modes other than walking, bus, car, motorcycle and bicycle that will account for the shortfall
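
The size of that shortfall can be read off by summing each distance column of the four-city tables posted above. The sketch below is Python rather than the package's R, and the names `share`, `bands` and `shortfall` are made up for the example; the figures are copied from those tables.

```python
# Mode share (%) by distance band, copied from the four-city tables above.
# Tuple order within each mode: 0-1 km, 2-5 km, 6+ km.
share = {
    "accra": {"walking": (96.4, 66.4, 2.0), "bicycle": (0.6, 0.3, 0.4),
              "car": (1.0, 10.4, 18.2), "motorcycle": (0.0, 0.0, 0.0),
              "bus": (0.0, 18.5, 71.2)},
    "sao_paulo": {"walking": (85.0, 33.6, 0.0), "bicycle": (0.3, 2.0, 0.9),
                  "car": (6.2, 39.5, 35.7), "motorcycle": (0.3, 3.9, 4.0),
                  "bus": (0.0, 19.7, 37.4)},
    "delhi": {"walking": (83.5, 27.8, 1.9), "bicycle": (2.0, 5.6, 5.0),
              "car": (2.0, 10.3, 18.1), "motorcycle": (5.6, 22.5, 22.3),
              "bus": (1.2, 21.1, 35.9)},
    "bangalore": {"walking": (85.3, 24.3, 0.6), "bicycle": (2.7, 9.7, 0.8),
                  "car": (1.1, 4.1, 5.4), "motorcycle": (5.3, 34.7, 56.6),
                  "bus": (4.0, 22.8, 34.0)},
}
bands = ("0-1 km", "2-5 km", "6+ km")

# Percentage of trips in each band not covered by the five listed modes.
shortfall = {
    city: [round(100 - sum(values[i] for values in by_mode.values()), 1)
           for i in range(len(bands))]
    for city, by_mode in share.items()
}

for city, missing in shortfall.items():
    print(city, dict(zip(bands, missing)))
```
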

JDWoodcock commented 5 years ago

If Ghana comes out as top for motorcycling there is a problem. But you could put 'missing' rather than 0%

rahulatiitd commented 5 years ago

@robj411 Leandro, James and I were looking at these tables, and are concerned about the 'other' modes which may bring the total to 100%. 10% missing in 0-1 km, when all the obvious modes are already there, is not making sense.. maybe we can do this better when you are here

robj411 commented 5 years ago

Okay, check the datasets in inst/extdata/local/CITY/trips_CITY.csv by e.g. selecting the subset of trips with trip_distance < 1.

To reproduce or change tables, see R/get_scenario_settings.R, or call get_scenario_settings().

rahulatiitd commented 5 years ago

@robj411 I am sure you took care of that, but just want to make sure that trip IDs are repeated across the rows, as multiple stages may be there. I hope you did not double count the trips in this case?
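
The double-counting concern can be illustrated with a short sketch: if a file has one row per stage and the trip ID repeats across rows, each trip should be counted once when computing mode share for trips under 1 km. This is Python rather than the package's R, and it is not the logic of get_scenario_settings(). The column name `trip_distance` comes from the comment above, while `trip_id`, `trip_mode`, the function `short_trip_mode_share` and the sample rows are hypothetical; the real layout of trips_CITY.csv may differ.

```python
import csv
import io
from collections import Counter

# Hypothetical stage-level data: trips 1 and 2 span several rows.
sample = """trip_id,trip_mode,trip_distance
1,walking,0.4
1,walking,0.4
2,bus,0.8
2,bus,0.8
2,bus,0.8
3,bicycle,0.6
4,car,3.2
"""

def short_trip_mode_share(rows, max_km=1.0):
    """Mode share (%) of trips shorter than `max_km`, counting each trip once."""
    seen = set()
    counts = Counter()
    for row in rows:
        if float(row["trip_distance"]) >= max_km:
            continue
        if row["trip_id"] in seen:  # a later stage of a trip already counted
            continue
        seen.add(row["trip_id"])
        counts[row["trip_mode"]] += 1
    total = sum(counts.values())
    return {mode: 100 * n / total for mode, n in counts.items()}

rows = list(csv.DictReader(io.StringIO(sample)))
share_by_mode = short_trip_mode_share(rows)
print(share_by_mode)
```

In this toy data, counting rows instead of trips would give bus half of the short trips rather than a third, which is the kind of distortion the question is about.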