swb-ief / etl-pipeline

The Covid Lens

Daily ward-wise cases correction #144

Closed SangeetaJayadevan closed 3 years ago

SangeetaJayadevan commented 3 years ago

Computing delta cases gives incorrect results when there is a gap of 2 or more days in the scraped data for wards. Hence one of these 2 alternative methods could be used:
1. Most accurate option: The 'Ward-wise new cases' table on page 24 of the PDF file lists the last 7 days of cases across the 24 wards. This table could be read to refresh the data for the last 7 days.
2. Imputation option: Take the number of days between the last date for which ward data is available and the current date, and impute the daily cases by dividing the delta by that number of days.
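
A minimal sketch of the imputation option, assuming the scraped ward figures are cumulative totals (so the delta is the difference between two totals). The function name `impute_daily_cases`, its parameters, and the sample dates and counts are hypothetical and not taken from the pipeline code:

```python
from datetime import date, timedelta


def impute_daily_cases(last_date, last_total, current_date, current_total):
    """Spread the delta between two cumulative totals evenly over the gap.

    Returns a dict mapping each date after last_date, up to and including
    current_date, to the imputed daily count.
    """
    gap_days = (current_date - last_date).days
    if gap_days <= 0:
        raise ValueError("current_date must be later than last_date")
    delta = current_total - last_total
    per_day = delta / gap_days  # equal share of the delta for each day
    return {
        last_date + timedelta(days=i): per_day
        for i in range(1, gap_days + 1)
    }


# Illustrative values only: one ward last scraped on 1 May, next on 4 May.
daily = impute_daily_cases(date(2021, 5, 1), 1000, date(2021, 5, 4), 1090)
```

Here the delta of 90 cases is split over the 3 days in the gap, so each of 2, 3 and 4 May is assigned 30 cases. An even split may produce fractional counts; whether to round them is left open in this sketch.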

For fatalities, option 2 needs to be used as there is no other data source.

Note: Option 2 is a reasonable workaround if the gap is generally only 2-3 days. If the gap tends to exceed 3 days, then we need to go with Option 1.
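
The rules above could be combined into a single selection step, sketched below. This applies the 3-day threshold to each individual gap, which is one possible reading of the note. The function name `choose_method`, the metric labels, and the returned method labels are all hypothetical:

```python
def choose_method(metric, gap_days, max_imputation_gap=3):
    """Pick a correction method for one ward and one gap in scraped data.

    metric: "cases" or "fatalities" (hypothetical labels).
    gap_days: days between the last available ward data and the current date.
    """
    if gap_days <= 1:
        # No gap: the plain day-over-day delta is already correct.
        return "plain_delta"
    if metric == "fatalities":
        # No alternative data source exists for fatalities, so impute.
        return "imputation"
    if gap_days > max_imputation_gap:
        # Longer gaps: read the 7-day 'Ward-wise new cases' table instead.
        return "pdf_table"
    return "imputation"
```

For example, `choose_method("cases", 5)` selects the PDF table, while `choose_method("fatalities", 5)` still falls back to imputation.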