act-now-coalition / covid-data-model

Data backend providing computed data for the graphs displayed at https://covidactnow.org
MIT License
149 stars, 57 forks

Parallelize Update Combined Datasets #1315

Closed: smcclure17 closed this 2 years ago

smcclure17 commented 2 years ago

Update datasets in parallel by partitioning the bulk MultiRegionDataset into individual state shards. With this change, the "Update Combined Datasets" action runs in ~38 minutes (including refreshing datasets), compared to the previous ~80-100 minutes.
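
For readers unfamiliar with the approach, here is a minimal sketch of the partition-then-parallelize idea. It is not the code in this PR: the row layout, `partition_by_state`, `update_shard`, and `update_in_parallel` are all hypothetical names, and plain dicts stand in for MultiRegionDataset. A thread pool keeps the sketch self-contained; CPU-bound work would more likely use a process pool.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical record layout, e.g. {"state": "CA", "fips": "06037", "cases": 10}.
Row = Dict[str, object]


def partition_by_state(rows: List[Row]) -> Dict[str, List[Row]]:
    """Split one bulk list of rows into per-state shards."""
    shards: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        shards[row["state"]].append(row)
    return dict(shards)


def update_shard(rows: List[Row]) -> List[Row]:
    """Placeholder for the per-state update; here it clamps negative counts to 0."""
    return [{**row, "cases": max(row["cases"], 0)} for row in rows]


def update_in_parallel(
    rows: List[Row],
    update: Callable[[List[Row]], List[Row]] = update_shard,
    max_workers: int = 4,
) -> List[Row]:
    """Update every state shard concurrently, then recombine the results."""
    shards = partition_by_state(rows)
    states = sorted(shards)  # fixed order so the combined output is deterministic
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as its inputs.
        results = list(pool.map(update, (shards[s] for s in states)))
    return [row for shard in results for row in shard]
```

Because each shard is updated independently, this only works when the per-state step needs no data from other states; any cross-state aggregation would have to run after the shards are recombined.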

Snapshot built using this PR: https://covid-projections-git-bump-snapshot-3560-2346-covidactnow.vercel.app/internal/compare/?left=3558&locations=0&metric=9&right=3560&sort=2