We will want a script that can run (ideally automatically, via CI), take all of the model_output files for a given week, and build an ensemble from them.
The script could live in the `src` folder, or it could live external to this repo.
The script should generate the ensemble file and save it in an appropriate `hub-ensemble` folder or some such place. I will file a separate issue to create the CI to "submit" the file.
We decided that this ensemble will be a linear pool:

- for mean predictions, submit the mean of the means (for any team that didn't submit means, compute their means from their submitted samples)
- for sample predictions, from each of the M contributing models choose 100 / M samples at random, randomly distributing any remainder in the number of samples across the models
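The two rules above can be sketched in Python. This is a minimal illustration, not the eventual script: the function and variable names (`allocate_samples`, `ensemble_mean`) are hypothetical, and the real implementation will operate on hubverse model_output files rather than plain dicts.

```python
import random
from statistics import mean

def allocate_samples(model_ids, total=100, rng=None):
    """Decide how many samples to draw from each of the M contributing
    models: floor(total / M) each, with the remainder spread at random."""
    rng = rng or random.Random()
    base, remainder = divmod(total, len(model_ids))
    counts = {model: base for model in model_ids}
    # Randomly pick `remainder` models to contribute one extra sample each.
    for model in rng.sample(model_ids, remainder):
        counts[model] += 1
    return counts

def ensemble_mean(model_means, model_samples):
    """Mean of the component means; any model missing from `model_means`
    contributes the mean of its submitted samples instead. Assumes every
    contributing model submitted samples."""
    vals = [model_means.get(m, mean(model_samples[m])) for m in model_samples]
    return mean(vals)
```

For example, with three models and 100 total samples, `allocate_samples` gives two models 33 samples and one (chosen at random) 34.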
Misc. other ideas for later analyses:

- what if we randomly selected the samples rather than stratifying by model? Our guess is that this would have larger Monte Carlo variability.
- what if we repeat the random selection multiple times? What is the variability in the ensemble score? Note: we could also bootstrap individual model samples to try to get at this.
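The second idea above could be probed with a sketch like the following. Everything here is an assumption for illustration: the name `selection_variability` is hypothetical, and the ensemble mean stands in for whatever proper score we end up using.

```python
import random
from statistics import mean, stdev

def selection_variability(all_samples, total=100, reps=200, seed=0):
    """Repeat the random (non-stratified) selection of `total` samples
    and report the spread of a summary statistic across repetitions."""
    # Pool every model's samples together (the non-stratified variant).
    pooled = [s for samples in all_samples.values() for s in samples]
    rng = random.Random(seed)
    # The ensemble mean is a placeholder for a real scoring rule.
    stats = [mean(rng.sample(pooled, total)) for _ in range(reps)]
    return mean(stats), stdev(stats)
```

The reported standard deviation gives a rough sense of the Monte Carlo variability introduced by the selection step alone.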