separate intervention parameters from sampling parameters

ManuelaRunge commented 4 years ago

To be able to run varying intervention scenarios within the same simulation experiment, using the same samples for all scenarios (i.e., social_multiplier defined in a similar way to Ki).

ManuelaRunge commented 4 years ago

Might be related to the issue of allowing multiple parameters to be varied the way Ki is, instead of Ki only: https://app.zenhub.com/workspaces/covid-19-modeling-5e8b85bb090561fddfdc48ec/issues/numalariamodeling/covid-chicago/183

ManuelaRunge commented 4 years ago

Hi @jacksonllee
I think it would be good to write the intervention parameters in a separate yaml file, and to allow having multiple of these if needed

To recap, the simulations are built from (combinations of):

fixed_parameters_region_specific
fitting_parameter
sample_parameter
intervention_parameter (effect_param + time_param) with time_param varying per startdate

maybe a --intervention_config intervention.yaml could be added in the python runScenarios.py command:

python runScenarios.py \
--running_location Local \
--region EMS_1 \
--experiment_config EMSspecific_sample_parameters.yaml \
--intervention_config intervention.yaml \
--emodl_template extendedmodel_cobey_interventionStop.emodl \
--name_suffix "scenario1_noTravel"

I think the overall structure, the age yaml file, and the spatial-age yaml structure would stay the same; it would 'just' insert a different source file for the additional parameters to be replaced...? Then the Ki value for-loop could stay as it is, and the Ki ticket can probably be closed.
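A minimal sketch of how such an --intervention_config option could be parsed. The parser below is a hypothetical stand-in, not the actual runScenarios.py parser: only the flag names come from the command above, while the defaults, the `required` settings, and the choice to allow the flag to be repeated (to cover "multiple of these if needed") are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the command above."""
    parser = argparse.ArgumentParser(description="Run simulation scenarios (sketch).")
    parser.add_argument("--running_location", default="Local")
    parser.add_argument("--region", required=True)
    parser.add_argument("--experiment_config", required=True)
    # Proposed flag; action="append" lets it be given more than once,
    # so several intervention yaml files can be supplied in one run.
    parser.add_argument("--intervention_config", action="append", default=[])
    parser.add_argument("--emodl_template", required=True)
    parser.add_argument("--name_suffix", default="")
    return parser


# Parse the example command from this comment (passed as a list, not sys.argv).
args = build_parser().parse_args([
    "--running_location", "Local",
    "--region", "EMS_1",
    "--experiment_config", "EMSspecific_sample_parameters.yaml",
    "--intervention_config", "intervention.yaml",
    "--emodl_template", "extendedmodel_cobey_interventionStop.emodl",
    "--name_suffix", "scenario1_noTravel",
])
print(args.intervention_config)
```

Leaving the flag out yields an empty list, so existing commands would keep working unchanged under this sketch.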

jacksonllee commented 4 years ago

Is there a reason why the intervention parameters cannot stay in the same yaml (e.g., extendedcobey_200428.yaml) with all other parameters?
Having intervention params in multiple files would seem hard to manage. What do you think about having a single top-level key like intervention_parameters (i.e., what #301 is already implementing), and then under this key we can have an array of objects (as opposed to just one object)? To be concrete,

from:

intervention_parameters:
  # Here there's only one single object, with keys "social_multiplier_1", "social_multiplier_2", etc.
  'social_multiplier_1':
    np.random: uniform
    function_kwargs: {'low':0.7, 'high':0.9}
  'social_multiplier_2':
    np.random: uniform
    function_kwargs: {'low':0.2, 'high':0.5}
  ...

to:

intervention_parameters:
  # Here we have an array of objects, each signaled by a dash in yaml.

  # This is the first object.
  - 'social_multiplier_1':
      np.random: uniform
      function_kwargs: {'low':0.7, 'high':0.9}
    'social_multiplier_2':
      np.random: uniform
      function_kwargs: {'low':0.2, 'high':0.5}

  # This is the second object.
  - 'social_multiplier_1':
      np.random: uniform
      function_kwargs: {'low':0.7, 'high':0.9}
    'social_multiplier_2':
      np.random: uniform
      function_kwargs: {'low':0.2, 'high':0.5}
  ...
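To illustrate how code could consume the array-of-objects form, here is a rough sketch. It works on the already-parsed structure (a list of dicts) rather than loading a yaml file, and `iter_scenarios` is a hypothetical helper name, not something in runScenarios.py. It also accepts the current single-object form so old configs would not break.

```python
# Parsed form of the proposed yaml: a list of dicts, one dict per scenario.
intervention_parameters = [
    {
        "social_multiplier_1": {"np.random": "uniform",
                                "function_kwargs": {"low": 0.7, "high": 0.9}},
        "social_multiplier_2": {"np.random": "uniform",
                                "function_kwargs": {"low": 0.2, "high": 0.5}},
    },
    {
        "social_multiplier_1": {"np.random": "uniform",
                                "function_kwargs": {"low": 0.7, "high": 0.9}},
        "social_multiplier_2": {"np.random": "uniform",
                                "function_kwargs": {"low": 0.2, "high": 0.5}},
    },
]


def iter_scenarios(params):
    """Yield (scenario_index, parameter_name, spec) for every parameter.

    A single dict (the current format) is treated as one scenario.
    """
    if isinstance(params, dict):
        params = [params]
    for index, scenario in enumerate(params):
        for name, spec in scenario.items():
            yield index, name, spec


for index, name, spec in iter_scenarios(intervention_parameters):
    print(index, name, spec["function_kwargs"])
```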

ManuelaRunge commented 4 years ago

hmm possibly yes.

I think we could either run separate simulations (but I would rather not have to change the yaml file each time, so the options would be stored beforehand in the yaml or in different yamls) or run one simulation with all the options included in it (as an additional instance in the full factorial combinations).

Below are two examples.

instead of running (scenario EMSgrp_interventionSTOPadj10)

#edit in yaml
  'backtonormal_multiplier':
    np.random: uniform
    function_kwargs: {'low':0.10, 'high':0.10}

python runScenarios.py --running_location Local --region IL --experiment_config spatial_EMS_experiment.yaml --emodl_template extendedmodel_EMS_grp_interventionSTOPadj.emodl --cfg_template model_B.cfg --name_suffix "EMSgrp_interventionSTOPadj10"

changing 10% to 30% in yaml file (scenario EMSgrp_interventionSTOPadj30)

# edit in yaml
  'backtonormal_multiplier':
    np.random: uniform
    function_kwargs: {'low':0.30, 'high':0.30}

python runScenarios.py --running_location Local --region IL --experiment_config spatial_EMS_experiment.yaml --emodl_template extendedmodel_EMS_grp_interventionSTOPadj.emodl --cfg_template model_B.cfg --name_suffix "EMSgrp_interventionSTOPadj30"

I would like to have something like

  'backtonormal_multiplier':
    np.random: uniform
    function_kwargs: {'low':0.10, 'high':0.30}

and a full factorial of the Ki values, the sampling parameters, and e.g. the backtonormal_multiplier (included in the intervention parameters), with the option to draw values for the intervention parameters independently from the sampling parameters.
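Roughly, the full factorial described here could look like the sketch below. All numbers (Ki values, number of samples, number of intervention draws) are made up for illustration, the standard library's `random.uniform` stands in for `np.random.uniform`, and the two separate random generators are just one assumed way to keep the intervention draws independent of the sampling draws.

```python
import itertools
import random

# Separate random streams, so intervention draws do not depend on the
# sampling-parameter draws (seeds are arbitrary illustration values).
rng_samples = random.Random(1)
rng_intervention = random.Random(2)

ki_values = [0.5, 0.7, 0.9]   # illustrative Ki values
n_samples = 2                 # illustrative number of samples
n_intervention = 3            # illustrative number of intervention values

# One draw per sample for a placeholder sampling parameter.
sample_draws = [rng_samples.uniform(0.0, 1.0) for _ in range(n_samples)]
# Intervention values drawn once, then reused for every sample and Ki.
backtonormal = [rng_intervention.uniform(0.10, 0.30) for _ in range(n_intervention)]

# Full factorial: every Ki x every sample x every intervention value.
rows = [
    {
        "Ki": ki,
        "sample_num": s,
        "sample_value": sample_draws[s],
        "backtonormal_multiplier": b,
    }
    for ki, s, b in itertools.product(ki_values, range(n_samples), backtonormal)
]
print(len(rows))  # 3 * 2 * 3 combinations
```

Each sample keeps the same sampled value across all intervention values, which is the "same samples for the scenarios" property asked for at the top of this issue.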

When running contact tracing simulations I do:

python runScenarios.py --running_location Local --region EMS_11 --experiment_config EMSspecific_sample_parameters.yaml --emodl_template extendedmodel_cobey_testDelay_contactTracing.emodl --name_suffix "contactTracing_testDelay1"
python runScenarios.py --running_location Local --region EMS_11 --experiment_config EMSspecific_sample_parameters.yaml --emodl_template extendedmodel_cobey_testDelay_contactTracing.emodl --name_suffix "contactTracing_testDelay2"
python runScenarios.py --running_location Local --region EMS_11 --experiment_config EMSspecific_sample_parameters.yaml --emodl_template extendedmodel_cobey_testDelay_contactTracing.emodl --name_suffix "contactTracing_testDelay3"

And I edit the yaml file between submissions (testDelay is not an intervention parameter, but I could move it there).

Does that make sense, or would it require too much of a change in the existing structure? Should we schedule another call for this?

jacksonllee commented 4 years ago

@ManuelaRunge If you'd like multiple values rather than just one value for a given intervention parameter (e.g., backtonormal_multiplier), since the code is already written in such a way as to use numpy's np.random.uniform with the given kwargs, you could simply add the size kwarg to get multiple values.

For example, changing from this (which calls np.random.uniform(low=0.1, high=0.3), returning one single scalar)

  'backtonormal_multiplier':
    np.random: uniform
    function_kwargs: {'low':0.10, 'high':0.30}

to this (which calls np.random.uniform(low=0.1, high=0.3, size=5), returning an array of 5 numbers)

  'backtonormal_multiplier':
    np.random: uniform
    function_kwargs: {'low':0.10, 'high':0.30, 'size': 5}

Would this input format work for you? I see the code wouldn't work as-is, because the size kwarg is already used for another purpose (https://github.com/numalariamodeling/covid-chicago/blob/9bfbc8822697ee56525348f6e275aaf04d6179b3/runScenarios.py#L32), but we could change that. Let me know if this approach (using the size kwarg in the yaml) would get the multiple intervention parameter values you'd like.
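One possible way around the size clash, sketched under assumptions: read `size` out of the yaml kwargs before sampling, so it only controls how many intervention values are drawn and never collides with the per-sample size that runScenarios.py passes. `draw_intervention_values` is a hypothetical helper, and the standard library's `random.uniform` stands in for `np.random.uniform` here.

```python
import random


def draw_intervention_values(spec, rng):
    """Return `size` draws for one intervention parameter (default: 1).

    `size` is popped from a copy of the yaml kwargs, so the remaining
    kwargs (low, high) can be passed to the sampler without a clash.
    """
    kwargs = dict(spec["function_kwargs"])  # copy; leave the config untouched
    size = kwargs.pop("size", 1)
    # Stand-in for np.random.uniform(low=..., high=...).
    return [rng.uniform(kwargs["low"], kwargs["high"]) for _ in range(size)]


spec = {
    "np.random": "uniform",
    "function_kwargs": {"low": 0.10, "high": 0.30, "size": 5},
}
values = draw_intervention_values(spec, random.Random(0))
print(len(values))
```

The number of samples would then be applied separately, when these values are combined with the sampled parameters.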

ManuelaRunge commented 4 years ago

Hi @jacksonllee , yes multiple values are possible with the

 'backtonormal_multiplier':
    np.random: uniform
    function_kwargs: {'low':0.10, 'high':0.30}

and as you mention, the size parameter is already used by the runScenarios.py to have as many values as samples.

I think having the size parameter in the yaml (as for the Ki values) would be a good idea. Then we could specify the number of parameter values for each effect size parameter (and ideally also for the time_parameter, which might be more complex) that are repeated for each sample, startdate and Ki...

jacksonllee commented 4 years ago

@ManuelaRunge Yup, I think it's all doable. For time parameters, the yaml currently looks like this, which allows only a single date (e.g., 2020-03-12 here):

  'socialDistance_time1':
    custom_function: DateToTimestep
    function_kwargs: {'dates': 2020-03-12, 'startdate_col': 'startdate'}

To allow multiple dates, what do you think about a format like this (e.g., 2020-03-12, 2020-03-13, and 2020-03-14 here):

  'socialDistance_time1':
    custom_function: DateToTimestep
    function_kwargs: {'dates': [2020-03-12, 2020-03-13, 2020-03-14]}

?

(I'd remove the 'startdate_col': 'startdate' bit, because the start date column name is always 'startdate' anyway?)
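As a rough sketch of what accepting a list of dates could mean on the code side: the function below converts one date or several into day offsets from the start date. `dates_to_timesteps` is a hypothetical name and is not the existing DateToTimestep implementation; the start date used in the example is made up.

```python
from datetime import date


def dates_to_timesteps(dates, startdate):
    """Convert one date or a list of dates to day offsets from startdate."""
    if isinstance(dates, date):  # keep the single-date format working
        dates = [dates]
    return [(d - startdate).days for d in dates]


startdate = date(2020, 2, 20)  # illustrative start date, not from this thread
steps = dates_to_timesteps(
    [date(2020, 3, 12), date(2020, 3, 13), date(2020, 3, 14)],
    startdate,
)
print(steps)
```

With several start dates, the same conversion would be repeated per start date, since the time step depends on it.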

ManuelaRunge commented 4 years ago

that looks good!

Just as an additional comment, having different dates as in function_kwargs: {'dates': [2020-03-12, 2020-03-13, 2020-03-14]} is different from defining 'socialDistance_time1', 'socialDistance_time2', 'socialDistance_time3' etc. The numbering refers to the number/type of time events, and the multiple dates are for uncertainty or varying scenarios about when that time event will happen. So what you suggest in your comment looks like what we need!

numalariamodeling / covid-chicago

separate intervention parameters from sampling parameters #266