epidemics / covid

epidemicforecasting.org visualization repository
http://epidemicforecasting.org
GNU Affero General Public License v3.0

More customizable GLEAMviz definitions #444

Open wolverdude opened 4 years ago

wolverdude commented 4 years ago

From the Data Engineering Roadmap

Currently, GLEAMviz parameters are set in this colab, but the only parameters being set are β and seasonality. Lists of possible values are set in configuration variables, and then a matrix of different scenarios is created.
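For illustration, the scenario matrix described above can be thought of as a Cartesian product of the configured value lists. This is a minimal sketch, not the colab's actual code: the list values and the `build_scenarios` helper are hypothetical.

```python
from itertools import product

# Hypothetical values; the real lists live in the colab's configuration variables.
BETA_MULTIPLIERS = [0.5, 1.0, 1.5]
SEASONALITIES = [0.85, 1.0]


def build_scenarios(beta_multipliers, seasonalities):
    """Return one dict per (beta multiplier, seasonality) combination."""
    return [
        {"beta_multiplier": beta, "seasonality": season}
        for beta, season in product(beta_multipliers, seasonalities)
    ]


scenarios = build_scenarios(BETA_MULTIPLIERS, SEASONALITIES)
```

Every additional parameter handled this way multiplies the number of scenarios, which is part of why hard-coding the product makes customization cumbersome.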

It's somewhat trivial to set other parameters, but the script is set up in a way that makes customization cumbersome. It would be helpful to refactor this function (and potentially definition.py) to make fewer assumptions about the different GLEAM traces desired while making it easy to configure scenarios with different sets of parameters.

The ideal would be for all parameters to be set from the specified scenarios spreadsheet (example for Pakistan, Gdoc spec).

wolverdude commented 4 years ago

@lagerros I created this based on our conversation Monday. The specification is a bit vague (could just be my memory being fuzzy), but I believe this is what you wanted. Is there a person on the modeling team I can contact for nailing down the acceptance criteria?

lagerros commented 4 years ago

Check out the "Multiple parameter spec" tab. https://docs.google.com/spreadsheets/d/1IxPMadPxjnphWSKG_6PxmsrCLoXe3cHGp1Ok9kcddPk/edit#gid=1831691945

There's a new column with a toggle where you can select the parameter you want to change, for each row.

I don't know how definition.py works, so I don't know the extent to which refactoring is needed to enable this.

Some nuances:

1) Parameters for "Seasonality" and "Airline traffic" can only have one value throughout the entire simulation and all locations. So they can only be set once for each "Class" in the spreadsheet, and there's no need to provide a location or start date & end date. (Whereas beta, epsilon, mu and imu can vary between locations and times within a single simulation.)

2) Here's a flag for a potential upcoming feature, though it is as of yet uncertain. Currently there's a distinction between:

At the moment, "Traces" are specified in the colab: [image]

However, this means that all trace settings are taken as a Cartesian product with all "Classes" in the spreadsheet. In future, we might want to allow defining different traces for each Class, from within the spreadsheet. (For example, you might want to say that in the "Strong" scenario we visualise uncertainty over incubation period; whereas in the "Weak" scenario we visualise uncertainty over Seasonality.)

Enabling this functionality requires thinking through what the best implementation would be, and I haven't done that yet. Suggestions welcome.
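Nuance 1 above could be enforced with a small check on the spreadsheet rows. The following is a minimal sketch under stated assumptions: the `Row` shape, the `validate_rows` helper, and the lower-cased parameter names are hypothetical and not taken from definition.py. Flagging a location or date on a global parameter is a design choice here; the spec only says they are not needed.

```python
from dataclasses import dataclass
from typing import Optional

# Parameters that hold a single value for the whole simulation (nuance 1).
GLOBAL_PARAMETERS = {"seasonality", "airline traffic"}
# Parameters that may vary between locations and time windows.
LOCAL_PARAMETERS = {"beta", "epsilon", "mu", "imu"}


@dataclass
class Row:
    """Hypothetical shape of one spreadsheet row."""
    class_name: str
    parameter: str
    value: float
    location: Optional[str] = None
    start_date: Optional[str] = None
    end_date: Optional[str] = None


def validate_rows(rows):
    """Return a list of human-readable problems; an empty list means the rows pass."""
    problems = []
    seen_globals = set()
    for i, row in enumerate(rows, start=1):
        name = row.parameter.lower()
        if name in GLOBAL_PARAMETERS:
            key = (row.class_name, name)
            if key in seen_globals:
                problems.append(
                    f"row {i}: {row.parameter} set more than once for class {row.class_name}"
                )
            seen_globals.add(key)
            if row.location or row.start_date or row.end_date:
                problems.append(
                    f"row {i}: {row.parameter} is global; location and dates do not apply"
                )
        elif name not in LOCAL_PARAMETERS:
            problems.append(f"row {i}: unknown parameter {row.parameter}")
    return problems
```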


Is the above clear enough as a spec?

If there are any questions, I believe asking Elizabeth ( @AcesoUnderGlass ) could help with a lot of them.

wolverdude commented 4 years ago

Okay, so for this spec, we do not want to change the Cartesian product of BETA_MULTIPLIERS and SEASONALITIES, but this may happen in the future.

What we do want for this spec is to implement all the options in that sheet except for the Trace column. The thing that remains unclear to me is how to resolve overlapping values for the same parameter. Is it just first/last wins? Is there a hierarchy based on region? Also, it seems likely that we'll want different combination rules for different kinds of parameters (e.g. monsoon should act like seasonality and modify the betas for everything else within its window).
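To make the overlap question concrete, here is a minimal sketch of one candidate rule, "last wins", where a later spreadsheet row overrides an earlier one for the same location and date. It is only an illustration of one option, not a decision: the entry keys and the `resolve_last_wins` helper are hypothetical.

```python
from datetime import date


def resolve_last_wins(entries, location, day):
    """Return the value of the last entry covering (location, day), or None.

    Each entry is a dict with hypothetical keys:
    location, start, end (inclusive dates), and value.
    """
    result = None
    for entry in entries:  # spreadsheet order, top to bottom
        if entry["location"] == location and entry["start"] <= day <= entry["end"]:
            result = entry["value"]  # later rows overwrite earlier ones
    return result


# Two overlapping beta settings for the same location (made-up numbers).
beta_entries = [
    {"location": "Pakistan", "start": date(2020, 4, 1), "end": date(2020, 6, 30), "value": 0.30},
    {"location": "Pakistan", "start": date(2020, 5, 1), "end": date(2020, 5, 31), "value": 0.20},
]
```

A modifier-style parameter such as a monsoon would need a different combination rule (for example, multiplying the resolved beta inside its window) rather than overriding it.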

Other questions:

I think it makes sense to implement this spec before #443, since you'll need some kind of interface to configure monsoons, and once #443 is implemented, we can just make it another kind of settable parameter.

lagerros commented 4 years ago

hnykda commented 4 years ago

Is this still "blocked", or can it be worked on?

wolverdude commented 4 years ago

Sorry, I forgot to update the card state. I should have a PR up tomorrow.

wolverdude commented 4 years ago

@lagerros I've updated the example spreadsheet with examples of what you can now configure. Please review and let me know if this is what you had in mind and what tweaks, if any, you would like me to make.

lagerros commented 4 years ago

@wolverdude I took a look -- this is really neat!

Especially nice is that having all those parameters there would mean different simulations could just be stored in different spreadsheets, with no need to edit the colab (which is an annoying feature at the moment).

One thing I'm confused about is the "Background conditions" -- are they used in a Cartesian product to generate traces? Or do they generate groups? (Cf. the distinction in my message above.)

wolverdude commented 4 years ago

Countermeasure package classes correspond to different scenarios/groups, one group per class. Background condition classes correspond to different Traces, one per class. Or if you want, it can be the other way around. Easy to change.

I basically copied the logic from the colab, so the behavior shouldn’t be much different. The one thing I did differently was lumping beta multipliers in with background conditions. I set it up this way based on what I was seeing on the current Balochistan site. If that’s undesirable, I can create a third Type for another Cartesian product.

wolverdude commented 4 years ago

One question I have for you (and the thing that kinda derailed me earlier):

Does this spreadsheet config supersede the scenarios and groups keys in the config.yaml? If not, then what should the info in config.yaml be used for?

wolverdude commented 4 years ago

@lagerros any thoughts on these? ^

lagerros commented 4 years ago

> Does this spreadsheet config supersede the scenarios and groups keys in the config.yaml? If not, then what should the info in config.yaml be used for?

I don't know what you mean by "supersede".

But basically, the group keys in config.yaml should be entirely determined by what's in the spreadsheet (as should the names of traces in the legend).

Making that easier, for example by downloading a config file from the colab or by some other means, is on the data engineering roadmap and could be very helpful.

wolverdude commented 4 years ago

Yeah, "supersede" means replace. The thing is, the functionality in the spreadsheet could replace both the scenarios and groups keys in the config, except for the names and descriptions.

I think what I'll do is have the new colab export the config or something.

wolverdude commented 4 years ago

Okay, I think I've got it. How about we just use "Group" and "Trace" as the values for the Type column, which should make it super clear how things are getting displayed. All parameters will be configured in the spreadsheet, but display names, etc. need to be placed in the config file for export. It would look like this:

scenarios:
  config_sheet: "https://docs.google.com/spreadsheets/d/abc/edit#gid=123"
  groups:
    - name: Weak Mitigation
      description: Mostly open borders; full opening of public places; no social distancing; little compliance with hygiene advice
    - name: Moderate Mitigation
      description: Some border closure; closure of schools; ban on public gatherings; partial social distancing
    - name: Strong Mitigation
      description: Strong external and internal border closure; full closure of public places (including places of worship); social distancing outside and within homes
    - name: Recommended Mitigation
      description: Moderate measures + compulsory masks; contact tracing; social distancing at places of worship and within homes
  traces:
    - name: "Slowest"
      description: Very slow spread (50%)
    - name: "Slower"
      description: Slower spread (75%)
    - name: "Expected"
      description: Expected spread (100%)
    - name: "Faster"
      description: Faster spread (150%)
    - name: "Fastest"
      description: Very fast spread (175%)

name values correspond to the values in the Class column, and description is what's displayed to the user, though for groups, name will also be the title of the tab for users to click on. If we want to decouple that, it would be pretty easy to add an optional display_name to the config.
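Since name values must match the Class column, a consistency check between the parsed config and the sheet could catch typos early. This is a minimal sketch under stated assumptions: `find_missing_names` and the `sheet_classes` mapping are hypothetical, and the config is shown as an already-parsed dict rather than loaded from config.yaml.

```python
def find_missing_names(config, sheet_classes):
    """Return Class values that have no matching name in the config.

    config: the parsed `scenarios` mapping from config.yaml.
    sheet_classes: hypothetical dict mapping a Type ("Group" or "Trace")
        to the set of Class values used in the spreadsheet.
    """
    missing = {}
    for type_name, key in (("Group", "groups"), ("Trace", "traces")):
        named = {item["name"] for item in config.get(key, [])}
        absent = sorted(sheet_classes.get(type_name, set()) - named)
        if absent:
            missing[type_name] = absent
    return missing


# Abbreviated version of the config above, as a parsed dict.
example_config = {
    "groups": [
        {"name": "Weak Mitigation", "description": "Mostly open borders"},
        {"name": "Strong Mitigation", "description": "Strong border closure"},
    ],
    "traces": [
        {"name": "Expected", "description": "Expected spread (100%)"},
    ],
}
```

For example, a sheet that uses a Class not named in the config would be reported under its Type, so the export could fail early instead of showing an unnamed tab.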

lagerros commented 4 years ago

Note that this assumes groups and traces are independent -- and I don't think that's assumed in the spreadsheet. (Feature, not bug)

This feature is not used at the moment; but might become useful in future.

But this might be fine as an MVP, and we can jump off that bridge when we get there (unless you can already see an easy fix for this now).

wolverdude commented 4 years ago

@AcesoUnderGlass @lagerros This issue was completed by epimodel#57; however, there are some design decisions and areas of the spec I never fully clarified. These are things you may consider changing.

Unused Code

Some code that I wrote and checked in ended up unused, and I neglected to remove it.

Gleam Parameters