smcclure17 closed this 2 years ago
Do you know why we have ~12k flow runs but ~54k task runs?
Yeah, this is because each flow usually has 5-6 tasks. We could safely remove the `validate` task from each flow, since it doesn't actually do anything (we never got around to implementing it in a working/cohesive manner IIRC, so we just disabled the code in it).
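As a rough sanity check on those numbers, here is a minimal sketch using only the approximate counts quoted above. It assumes every flow run executes the `validate` task exactly once, which the thread does not confirm, so treat the result as a ballpark estimate.

```python
# Approximate counts quoted in the thread; not exact figures.
flow_runs = 12_000
task_runs = 54_000

# Average number of task runs per flow run.
tasks_per_flow = task_runs / flow_runs

# Assumption: each flow run executes the validate task exactly once,
# so dropping it removes one task run per flow run.
task_runs_without_validate = task_runs - flow_runs

print(f"average tasks per flow run: {tasks_per_flow:.1f}")
print(f"task runs without validate: ~{task_runs_without_validate:,}")
```

Under that assumption, removing `validate` would trim roughly a fifth of the task runs.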
Decreases the number of scrapers executed in the `MainFlow`s from 114 to 75. The majority of the remaining scrapers are state-specific demographic vaccine scrapers.

This disables any scrapers not currently used further downstream in the pipeline, including those that don't collect timeseries data. So, if we want to re-activate those scrapers in the future, we will be missing chunks of timeseries data, but I find it unlikely that we'll do so (and I don't think it's a huge deal considering the stable/stagnant nature of the vaccine data).