Open gotom22 opened 6 years ago
@robj411
strikes me as closely related to scenarios #27
i think this is best kept separate from scenarios
ok, maybe similar structural approach as in identifying for which "parameters" we would want to understand what kind of uncertainty information...
The complete list of uncertain input parameters is likely going to be large. At the moment we can just assume that we will have uncertain input parameters all over the model, and build dataframes so that this uncertainty can be propagated through the model.
Also note that we will likely have model uncertainties, which can be different from parameter uncertainty. For example, which injury model to use for given data.
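As a rough illustration of building a table of samples so that uncertainty can be propagated through the model, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the names `param_a` and `param_b`, the lognormal choice, the numeric values, and the toy model are hypothetical and are not taken from the model code.

```python
import math
import random


def sample_parameters(n_samples, specs, seed=0):
    """Build one row per sample; each row maps parameter name -> value.

    specs maps a name either to a fixed number (no uncertainty) or to a
    (median, sigma) tuple, read here as a lognormal distribution.
    The lognormal choice is an assumption of this sketch.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        row = {}
        for name, spec in specs.items():
            if isinstance(spec, tuple):
                median, sigma = spec
                row[name] = rng.lognormvariate(math.log(median), sigma)
            else:
                row[name] = float(spec)
        rows.append(row)
    return rows


def propagate(rows, model):
    """Run the model once per sampled row and collect the outcomes."""
    return [model(row) for row in rows]


# Hypothetical parameters and a toy model, for illustration only.
specs = {"param_a": (2.0, 0.3), "param_b": 7.0}
rows = sample_parameters(1024, specs)
outcomes = propagate(rows, lambda row: row["param_a"] * row["param_b"])
```

Keeping every parameter in the same table, whether it is fixed or sampled, means a parameter can be switched from certain to uncertain by changing its entry in `specs` only.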
So far, we have accommodated uncertainty for the following parameters:
* walk-to-bus time
* cycling MMETs
* walking MMETs
* background PM2.5
* motorcycle distance relative to car
* non-travel PA
* non-communicable disease background burden
* traffic PM2.5 share
* injury reporting rate
* day-to-week travel scalar
* all-cause mortality (PA)
* IHD (PA)
* cancer (PA)
* lung cancer (PA)
* stroke (PA)
* diabetes (PA)
* IHD (AP)
* lung cancer (AP)
* COPD (AP)
* stroke (AP)
See https://www.overleaf.com/read/mrjtkhffzfzr for details.
Very good start!
Compared with the published Sao Paulo paper, the following uncertainties are not yet included (and it is not certain whether they are relevant in the current version):
Injury YLD & duration uncertainties: does this relate to the extrapolation of injury fatalities to injury YLL? This is a factor we can model as a variable parameter.
We could incorporate emissions uncertainty in the emissions factors, but I don't think the handling of these has been settled yet.
For injury linearity, two parameters have been defined, in the updated list below.
See https://www.overleaf.com/read/mrjtkhffzfzr for details.
Injury YLD uncertainty has a few different elements. One is the extrapolation of the number of deaths to YLL (as you point out), but this relates more to the fatality side. Also, data for this can be extracted from GBD.
The non-fatal injury part is more complicated. First, we need the number of injuries, either as a total or divided between mild and serious. From this we then estimate YLD per injury, taking into account that some injuries cause life-long consequences. All of these extrapolations are uncertain. In the end, however, everything depends on how the non-fatal burden of injuries will be estimated in the model.
OK thanks. I will incorporate this in a new issue as it's not currently included anywhere in the code.
There are two new sources of uncertainty that are a little less straightforward to parametrise. One has to do with emissions, and the other the non-travel PA data.
The emissions will be represented by a Dirichlet distribution, and the non-travel PA will have two parameters: one scalar for the non-zero values, as before, and a new parameter that varies the proportion of non-zero values by demographic group, each represented by a Beta distribution.
For each case, I propose we supply a confidence value, between 0 and 1, where 1 represents full confidence and we use the raw data as provided. We interpret a value between 0 and 1 to parametrise a distribution. See pdfs for examples of how these could look.
I'd like to know if this seems like a reasonable approach; if not, what our alternatives are; if so, how we'd like the mapping from confidence to distribution to look.
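To make the question concrete, here is one possible shape for the confidence-to-distribution mapping, sketched in plain Python. The mapping itself (`concentration`, log-linear between two arbitrary bounds) is purely hypothetical, as are the function names and the example inventory keys; it is one candidate to discuss, not a settled choice. The Dirichlet draw is built from normalised gamma draws, which is a standard construction.

```python
import random


def concentration(confidence, low=10.0, high=10000.0):
    """Hypothetical mapping from confidence in [0, 1] to a concentration.

    Higher confidence gives a larger concentration, i.e. samples that sit
    closer to the raw data. The bounds low and high are arbitrary.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return low * (high / low) ** confidence


def sample_emission_shares(inventory, confidence, rng):
    """Draw emission shares from a Dirichlet centred on the raw shares."""
    total = sum(inventory.values())
    shares = {mode: value / total for mode, value in inventory.items()}
    if confidence == 1.0:
        return shares  # full confidence: use the raw data as provided
    kappa = concentration(confidence)
    # Normalised Gamma(kappa * share, 1) draws follow Dirichlet(kappa * shares).
    draws = {
        mode: rng.gammavariate(kappa * share, 1.0) if share > 0 else 0.0
        for mode, share in shares.items()
    }
    norm = sum(draws.values())
    return {mode: draw / norm for mode, draw in draws.items()}


def sample_nonzero_proportion(proportion, confidence, rng):
    """Draw the proportion of non-zero PA values for one demographic group."""
    if confidence == 1.0 or proportion in (0.0, 1.0):
        return proportion
    kappa = concentration(confidence)
    # Beta with mean equal to the observed proportion.
    return rng.betavariate(kappa * proportion, kappa * (1.0 - proportion))
```

With this form, both distributions keep the raw data as their mean, and the confidence value only controls how tightly samples cluster around it.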
this seems reasonable to me
For VOI analysis, we have the option to group parameters. That is, we assume that if we were to learn one parameter, we would learn another also, so it makes sense to work out the value in learning both together, rather than one at a time.
For example, I assume we learn the whole emission inventory together, and we learn the four AP DR parameters together for a given disease (i.e., we learn the curve of the disease, which is defined by four parameters).
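A minimal sketch of what grouping means computationally, assuming a regression-based proxy for VOI: regress the outcome on all parameters in a group at once and report the percentage of outcome variance the fit explains. A plain linear least-squares fit is swapped in here for simplicity; it is not necessarily the method used in the model code, and the function names are hypothetical.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def explained_variance_percent(samples, outcome, group):
    """Percent of outcome variance explained jointly by the group.

    samples: list of dicts (parameter name -> sampled value)
    outcome: list of model outcomes, one per sample
    group:   names of the parameters assumed to be learned together
    """
    n = len(outcome)
    mean_y = sum(outcome) / n
    var_y = sum((y - mean_y) ** 2 for y in outcome) / n
    if var_y == 0.0:
        return 0.0  # constant outcome: nothing to explain
    design = [[1.0] + [row[name] for name in group] for row in samples]
    k = len(design[0])
    # Normal equations (X'X) beta = X'y for the linear fit.
    xtx = [[sum(d[p] * d[q] for d in design) for q in range(k)] for p in range(k)]
    xty = [sum(d[p] * y for d, y in zip(design, outcome)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, d)) for d in design]
    resid = sum((y - f) ** 2 for y, f in zip(outcome, fitted)) / n
    return 100.0 * (var_y - resid) / var_y
```

Passing several names in `group` gives the value of learning them together; passing a single name gives the one-at-a-time value, so the two can be compared directly.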
I list below the options, first those that belong to the whole model, and then those that will be specific to each setting.
Model parameters
Setting-specific parameters
Groups proposed so far
Update
The results for VOI are stored in results/multi_city/.
Simulations, presently using 1024 samples, take ~40 min on 16 cores.
There is sometimes a spurious correlation; this happens particularly for the multivariate emission_inventory with city-scenario combinations that show no change, e.g. Bangalore motorcycle. Perhaps we should omit these calculations entirely.
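One way omitting these could look, as a hedged sketch: before running VOI, drop any city-scenario combination whose outcome does not vary across samples. The function names, the tolerance, and the dictionary layout keyed by (city, scenario) are assumptions for illustration, not the structure used in the results files.

```python
def has_variation(outcomes, rel_tol=1e-9):
    """True if the outcomes differ by more than a small relative tolerance."""
    lowest, highest = min(outcomes), max(outcomes)
    scale = max(abs(lowest), abs(highest), 1.0)
    return (highest - lowest) > rel_tol * scale


def combinations_to_analyse(outcomes_by_combination, rel_tol=1e-9):
    """Keep only the (city, scenario) keys whose outcomes actually change."""
    return [
        key
        for key, outcomes in outcomes_by_combination.items()
        if has_variation(outcomes, rel_tol)
    ]
```

Filtering up front means any correlation reported for the remaining combinations is at least computed against an outcome that moves.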
Still to do:
Map out the sensitivity/uncertainty parameters that we would want to understand (incl. in sensitivity analysis).