nismod / smif

Simulation Modelling Integration Framework
http://www.itrc.org.uk
MIT License
22 stars 6 forks source link

Add `smif prepare` command line argument for variability #364

Closed willu47 closed 5 years ago

willu47 commented 5 years ago
tlestang commented 5 years ago

Within the variability branch, the smif command line interface now has a prepare option. smif prepare takes three arguments:

  1. A model run name (Ex: energy_central)
  2. A scenario name (Ex: weather_at_home)
  3. Two integers indicating the lower and upper bound of the variants range (Ex: 3 6)

What does smif prepare energy_central weather_at_home 3 6 do?

  1. It looks in config/scenarios/ for a template file weather_at_home_template.yml and generate a new weather_at_home.yml containing variants replicate 3, replicate 4,...,replicate 6.
  2. It reads config/model_runs/energy_central.yml and generate energy_central_3.yml,energy_central_4.yml,...,energy_central_6.yml. For each of these the scenario variant for weather_at_home in the original energy_central.yml is replace by the corresponding replicate.

The scenario template weather_at_home_template.yml must be provided as follows

$ cat weather_at_home_template.yml
name: weather_at_home
description: UK weather
provides:
- name: maximum_temperature
 ...

variants:
- name: my_dummy_variant
  description: ''
  data:
    maximum_temperature: t_max__NF.csv
    minimum_temperature: t_min__NF.csv
    average_radiation: rsds__NF.csv
    average_wind_speed: wss__NF.csv

In the resulting weather_at_home.yml, variant x will be defined by


...
- name: replicate_x
  description: ''
  data:
  maximum_temperature: t_max__NFx.csv
  minimum_temperature: t_min__NFx.csv
  average_radiation: rsds__NFx.csv
  average_wind_speed: wss__NFx.csv
...
tlestang commented 5 years ago

At the moment, the scripts of issues #362 and #363 (write_scenario_variants.py and write_variant_model_runs_from_template.py) have been turned into functions in smif/cli/__init__.py

TODOS

willu47 commented 5 years ago

Hi @tlestang - I was able to use the functionality you described and the sample project.

I think we should split the functionality into two separate pieces:

  1. generate the scenario variants in the config file
  2. generate the model runs from a template model run referencing a scenario with multiple variants

Note that the latter would also provide us with the ability to programmatically generate all model runs for the variants in a scenario - that's a bonus, even if the source scenario isn't one that is inherently about variability.

tlestang commented 5 years ago

Hi @willu47 and @tomalrussell , I split the smif prepare command into two commands: smif prepare-scenario and smif prepare-run. Examples:

$ cat /config/scenarios/weather_at_home_template.yml
...
variants:
- name: my_dummy_variant
  description: ''"
  data:
    maximum_temperature: t_max__NF.csv
    minimum_temperature: t_min__NF.csv
    average_radiation: rsds__NF.csv
    average_wind_speed: wss__NF.csv

The resulting weather_at_home.yml specifies 100 variants, for instance

....
- name: weather_at_home_56
  description: ''"
  data:
    maximum_temperature: t_max__NF56.csv
    minimum_temperature: t_min__NF56.csv
    average_radiation: rsds__NF56.csv
    average_wind_speed: wss__NF56.csv

Remark 1 Variants are therefore specified by their position in the scenario file. Another approach would be to specify a particular variant, for instance --start=weather_at_home_56,10 to select replicates 56 to 66. It could also be possible to get non-consecutive variants: --variants=population_low,population_high... or specified in a file ? --variants=path_to_file with

$ cat path_to_file
weather_at_home_2
weather_at_home_74
weather_at_home_67
...

Let me know if you think those features are worth implementing now. Remark 2 I just realize I forgot to pad the indices (_1 --> _001). Will do :)

tomalrussell commented 5 years ago

Hi @tlestang - this looks promising, splitting the command is a useful improvement.

Had a quick look at the implementation - do you think it can be reworked only to use the store (and not to touch the filesystem directly)? If so, it would be good to maintain the separation.

Remark 1 - these features all sound surplus to requirements; can be covered by running smif prepare-run a few times as you describe it above.

What does smif prepare-run energy_central weather_at_home do with other scenarios in the model run which have variants?

Remark 2 - Sounds like a good default. I started thinking about passing in a format string then wondered if smif prepare-scenario would be simpler and more flexible with a single argument for the variant string - what do you reckon?

$ smif prepare-scenario weather_at_home <template_value>
$ smif prepare-scenario weather_at_home 027
$ seq -f "NF%02g" 1 10 | xargs -n 1  smif prepare-scenario weather_at_home $1
tlestang commented 5 years ago

What does smif prepare-run energy_central weather_at_home do with other scenarios in the model run which have variants?

It should do nothing. This command considers only variants of the scenario that is passer as an argument.

wondered if smif prepare-scenario would be simpler and more flexible with a single argument for the variant string

Do you mean 1 call to prepare-scenario for 1 variant ? So it is up to the user to iterate over the variants ?

tlestang commented 5 years ago

@tomalrussell I modified prepare_scenario and prepare_run to use only the store and not the underlying file store. I think I did not fully grasp how and why the file store is encapsulated into the upper level Store class... sorry about that! However, some of the checks for existing config files have been removed because they relied on methods of the file store. I detail the changes below.

tomalrussell commented 5 years ago

Hey @tlestang that all sounds reasonable. If you're happy with this, go ahead and open a pull request.

tlestang commented 5 years ago

While testing the store.prepare_scenario() method I noticed that some tests would fail when using the memory interface, although the functionality works fine using the file store.

MemoryConfigStore.update_scenario_variant() does not modify the key by which the updated variant is indexed in the attribute _scenarios[scenario_name]['variants'] dictionary. So if the initial name of the variant is 'low' for scenario 'mortality' and one uses update_scenario_variant(scenario_name, 'low', updated_variant) with updated_variant['name']='new_name' the new variant is still indexed by 'low':

>>> print(store.config_store._scenarios['mortatlity']['variants']['low']['name'])
'new_name'

I think this remark is only relevant for the memory interface, in which the configuration file structure is mocked using dictionaries.

As a consequence, if you use update_scenario_variant(scenario_name, variant_name, variant) or write_scenario_variant(scenario_name, variant), then modifying the variant dictionary afterwards will also modify the underlying MemoryConfigStore._scenarios dictionary, since assignments in Python do not copy objects.. (I still have a hard time with that!) On the other hand, using the file store, modifying the variant dict. will have no effect on the scenario config file itself.

For instance say you want to add two variants based on the 'mortality' scenario variant 'low'.

# Get the variant dict. for template variant 'low'
variant = self.read_scenario_variant('mortatlity', 'low')
for ivar in [1,3]:
           variant['name'] = 'mortatlity_additional_variant_{:03d}'.format(ivar)
           self.write_scenario_variant(scenario_name, variant)

Using the YamlConfigStore, mortality.yml will contain 3 variants low, mortality_additional_variant_001 and mortality_additional_variant_002. That is the expected behavior. However using the memory interface, after the loop the additional variants added to the scenario 'mortality' will have the same name mortality_additional_variant_002.

To recover the expected behavior with the memory interface, it suffices to copy the variant dict

# Get the variant dict. for template variant 'low'
variant = self.read_scenario_variant('mortatlity', 'low')
for ivar in [1,3]:
           variant_copy = deepcopy(variant)
           variant_copy['name'] = 'mortality_additional_variant_{:03d}'.format(ivar)
           self.write_scenario_variant(scenario_name, variant_copy)

However, the test would pass fine without the copy if using the file store, which is the relevant implementation of the Store in practice. The copy is therefore only required for the testing using the memory interface...

tomalrussell commented 5 years ago

Closed by #381