swb-ief / etl-pipeline

The Covid Lens

Create Task: Bengaluru ward (zones) data processing #116

Closed Nozziel closed 3 years ago

Nozziel commented 3 years ago

Scrape the latest PDF from https://bbmp.gov.in/warroombulletin.html. We can probably figure out what the correct URL is, but the URLs are a bit inconsistent. Maybe scan the folder for it instead: the folder is accessible, and so is its parent folder. We could use the Python BeautifulSoup package to scrape the file tables?
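A rough sketch of the folder-scanning idea, assuming the folder page is a plain HTML table of links. It uses the standard library's html.parser rather than BeautifulSoup so that it runs without extra installs; with BeautifulSoup, `soup.find_all("a", href=True)` would play the same role. The sample listing, the example.org URL, and the names `PdfLinkParser` and `find_pdf_links` are made up for illustration and are not taken from the BBMP site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collects the absolute URL of every <a> tag whose href ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Match case-insensitively, since file names may be inconsistent.
        if href and href.lower().endswith(".pdf"):
            self.pdf_links.append(urljoin(self.base_url, href))


def find_pdf_links(html, base_url):
    """Return all PDF links found in a directory-listing page."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_links


# Hypothetical directory listing; the real page layout is an assumption.
SAMPLE_LISTING = """
<table>
  <tr><td><a href="../">Parent Directory</a></td></tr>
  <tr><td><a href="bulletin_a.pdf">bulletin_a.pdf</a></td></tr>
  <tr><td><a href="notes.txt">notes.txt</a></td></tr>
  <tr><td><a href="Bulletin_B.PDF">Bulletin_B.PDF</a></td></tr>
</table>
"""

links = find_pdf_links(SAMPLE_LISTING, "https://example.org/files/")
print(links)
```

Picking the latest file out of `links` would still depend on how the bulletins are named or dated, which is not settled here.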

Pg 6: Fatalities (delta) need to be computed from the Total Fatalities.

Pg 9: Tests (sum of RT-PCR + RAT) and TPR are available. Confirmed cases are to be computed as Tests * TPR. Note: TPR is a percentage, so divide by 100.
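The two derived figures could be computed roughly like this. The function names and sample numbers are hypothetical, and rounding confirmed cases to a whole number is an assumption, not something the bulletin specifies.

```python
def daily_fatalities(total_fatalities):
    """Turn a series of cumulative fatality totals into day-over-day deltas."""
    return [
        today - yesterday
        for yesterday, today in zip(total_fatalities, total_fatalities[1:])
    ]


def confirmed_cases(rt_pcr_tests, rat_tests, tpr_percent):
    """Estimate confirmed cases as Tests * TPR, with TPR given as a percentage."""
    tests = rt_pcr_tests + rat_tests  # Tests is the sum of RT-PCR and RAT
    # Assumption: round to the nearest whole case.
    return round(tests * tpr_percent / 100)


# Made-up numbers for illustration only.
deltas = daily_fatalities([100, 103, 103, 108])
cases = confirmed_cases(rt_pcr_tests=600, rat_tests=400, tpr_percent=2.5)
print(deltas, cases)
```

Note that the first day in a series has no delta unless the previous day's total is also available.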

Then add it to the fetch_ward_data.py task.

Note: create an extract_bengaluru_wards_pdf.py file in the backend package

Nozziel commented 3 years ago

no longer needed