cmu-delphi / covidcast-indicators

Back end for producing indicators and loading them into the COVIDcast API.
https://cmu-delphi.github.io/delphi-epidata/api/covidcast.html
MIT License

convert claims hospital and quidel to chunk backfill into month instead of 28 days #2071

Open aysim319 opened 3 weeks ago

aysim319 commented 3 weeks ago

Currently, the source data for claims hospital and quidel is written to parquet files that are merged every 28 days. When a patch is needed, the parquet files must be patched as well, which significantly increases complexity: multiple files need to be modified, and the process is hard to automate as is.

The current process is a maintainability problem. The backfill is a system parallel to the indicator and requires separate processes to maintain. Furthermore, the specific way the parquet files are created and updated significantly increases the complexity of patching. With the current process, an outage forces multiple files that were originally unrelated to the outage dates to be modified in a cascading fashion. As a result, patching signals takes extra time and manual intervention.
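
To illustrate the cascading effect, here is a minimal sketch and not the repository's actual code. It assumes one file per day and models the 28-day merge as windows counted from the first available date; the function names (`month_chunk_key`, `rolling_chunks`, `monthly_chunks`) are hypothetical. Under that assumption, a missing day shifts the boundaries of the 28-day windows, while a calendar-month key depends only on the date itself.

```python
from collections import defaultdict
from datetime import date, timedelta


def month_chunk_key(issue_date: date) -> str:
    """Hypothetical chunk label that depends only on the date itself."""
    return issue_date.strftime("%Y%m")


def rolling_chunks(dates, window=28):
    """Group dates into consecutive `window`-day chunks.

    Assumption for illustration: windows are counted from the first
    available date, so chunk membership depends on which dates exist.
    """
    dates = sorted(dates)
    chunks = defaultdict(list)
    start = dates[0]
    for d in dates:
        chunks[(d - start).days // window].append(d)
    return dict(chunks)


def monthly_chunks(dates):
    """Group dates by calendar month; membership is independent of other dates."""
    chunks = defaultdict(list)
    for d in sorted(dates):
        chunks[month_chunk_key(d)].append(d)
    return dict(chunks)


def chunk_of(chunks, target):
    """Return the key of the chunk that contains `target`."""
    return next(key for key, members in chunks.items() if target in members)


# Daily dates for 2024-01-01 through 2024-03-31 (91 days).
all_dates = [date(2024, 1, 1) + timedelta(days=i) for i in range(91)]
# Simulate an outage on the first day.
with_outage = all_dates[1:]

probe = date(2024, 1, 29)
print(chunk_of(rolling_chunks(all_dates), probe),
      chunk_of(rolling_chunks(with_outage), probe))
print(chunk_of(monthly_chunks(all_dates), probe),
      chunk_of(monthly_chunks(with_outage), probe))
```

In this sketch the probe date moves to a different 28-day chunk once the first day is missing, but stays in the same monthly chunk. A patch for a given date would then touch only the file for that month.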