Closed: vukosim closed this issue 4 years ago
Hey @cishiv you might be able to help with this one.
Sure. If no one else gets to it before me, I'll take a look @vukosim
Will probably only be during the weekend though.
No problem @cishiv. Let's get it right.
@vukosim, @cishiv what is the status of this feature? I'm happy to assist, but don't want to duplicate effort. Let me know, then I'll either get cracking on this ticket or pick up another issue/feature from the repo.
Assigned @Ari-Ramkilowan you can pick it up.
@vukosim Progress on this feature ...
I have written some Python code to extract date, province and infection count for each of the WhatsApp text files from NDOH.
Whenever the notebook is executed, it looks inside the relevant data folder. If a previously unprocessed .txt file exists, it extracts the information from the infection count breakdown section of the WhatsApp message and stores it in a .csv.
I'm not 100% clear on how this data is to be used, so does it make sense to just create a .csv for every .txt file, or is the aim to have one .csv that is continually updated? I think the former is my favoured approach (even though I currently have a single csv with all the data extracted - it might become harder to maintain this approach in the long run though).
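If the one-.csv-per-.txt option is chosen, the "previously unprocessed" check described above could be sketched roughly as follows. This is not the notebook's code: find_unprocessed is a hypothetical helper, and it assumes each processed .txt gets a .csv with the same name next to it.

```python
from pathlib import Path


def find_unprocessed(data_dir):
    """Return the .txt files in data_dir that have no matching .csv yet.

    Assumption: a file counts as processed once a .csv with the same
    stem exists in the same folder (e.g. update.txt -> update.csv).
    """
    data_dir = Path(data_dir)
    return sorted(
        txt
        for txt in data_dir.glob("*.txt")
        if not txt.with_suffix(".csv").exists()
    )
```

With a single continually updated .csv, the same check would instead need a record of which source files have already been appended.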
Let me know what you'd like from this feature, I'll tidy up the notebook and make a PR.
As an example of what we currently have available, performing a groupby on the extracted data yields the following output
by_province.get_group('KZN')
Ahh. Thanks. This will be very helpful @Ari-Ramkilowan
@vukosim. I just updated the notebook to get the difference in infection count for any two given dates (for which data exists). Sample output below:
diff_by_dates(df, '2020-04-13','2020-04-10')
gives
@vukosim : PR sent - after it gets approved I'll hop onto the next issue
Thanks @Ari-Ramkilowan
Is your feature request related to a problem? Please describe.
Currently, we only get final numbers from the NICD/DoH. One place we can get this data is their WhatsApp information service, which gives daily numbers after each update.
We need a solution to check 2 WhatsApp updates, calculate the difference and create the CSV.
The WhatsApp messages are now stored in data/doh_whatsapp/ as .txt files
Describe the solution you'd like
Process 2 consecutive WhatsApp .txt files and then output a CSV that follows the confirmed.csv template, similar to the scraper.
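A minimal sketch of the "difference between two consecutive updates" step, assuming each update has already been parsed into a province-to-cumulative-count mapping. The function names and the date, province, new_cases columns are placeholders for illustration only; the real output would have to follow the confirmed.csv template, whose columns are not listed in this issue.

```python
import csv
import io


def diff_counts(earlier, later):
    """Per-province increase between two cumulative count mappings.

    Provinces missing from one of the updates are treated as 0 there.
    """
    provinces = sorted(set(earlier) | set(later))
    return {p: later.get(p, 0) - earlier.get(p, 0) for p in provinces}


def to_csv(date, diffs):
    """Render the differences as CSV text.

    The header below is hypothetical, not the confirmed.csv template.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["date", "province", "new_cases"])
    for province, new_cases in diffs.items():
        writer.writerow([date, province, new_cases])
    return buf.getvalue()
```

A negative difference would point to a correction in the later update (or a parsing problem), so it is probably worth flagging rather than writing out silently.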
Example from the WhatsApp line, extracted into a .txt file:
Current Status of Cases of COVID-19 in South Africa
24 MARCH 2020 - 11:28am
Total cases: 554
153 New cases
2 Full recovery (Confirmed Negative and cleared for returning home)
0 Deaths
The breakdown per province of total infections is as follows:
302 Gauteng
130 Western Cape
80 KwaZulu Natal
18 Free State
5 North West
9 Mpumalnaga
4 Limpopo
2 Northern Cape
2 Eastern Cape
Current projections estimate that the virus could effect 60% of South Africa's citizens at some point, but not at the same time. Most South Africans will only experience mild symptoms and humans are capable of developing immunity to the virus.
The National Department of Health will now be releasing results as they are submitted by both private and public laboratories. In instances where NDOH confirmatory tests yield different results, the public will be duly informed.
TEST RESULTS OF CITIZENS REPATRIATED FROM WUHAN: All the citizens from Wuhan were tested and their results came back negative for COVID-19. They will continue to be kept in quarantine for the prescribed period and will thereafter be reunified with the community.
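For a message shaped like the example above, the date and the per-province breakdown could be pulled out along these lines. This is a sketch, not the notebook's code: parse_message, PROVINCES and ALIASES are hypothetical names, the province spellings are taken from this one sample (including its "Mpumalnaga" typo), and other messages may be laid out differently.

```python
import re
from datetime import datetime

# Spellings as seen in the sample message; extend as new variants show up.
PROVINCES = [
    "Gauteng", "Western Cape", "KwaZulu Natal", "Free State", "North West",
    "Mpumalnaga", "Mpumalanga", "Limpopo", "Northern Cape", "Eastern Cape",
]
# Map known misspellings to one canonical name.
ALIASES = {"Mpumalnaga": "Mpumalanga"}

DATE_RE = re.compile(r"(\d{1,2})\s+([A-Za-z]+)\s+(\d{4})")
COUNT_RE = re.compile(
    r"(\d+)\s+(" + "|".join(re.escape(p) for p in PROVINCES) + r")\b"
)


def parse_message(text):
    """Return (ISO date string, {province: cumulative count})."""
    # First "day month year" in the text, e.g. "24 MARCH 2020".
    day, month, year = DATE_RE.search(text).groups()
    date = datetime.strptime(f"{day} {month.title()} {year}", "%d %B %Y").date()
    # Only look at the text from the breakdown sentence onwards.
    start = text.index("breakdown per province")
    counts = {}
    for number, name in COUNT_RE.findall(text[start:]):
        counts[ALIASES.get(name, name)] = int(number)
    return date.isoformat(), counts
```

Matching on whitespace rather than line breaks means the sketch does not depend on whether each province sits on its own line. Note that in this sample the provincial figures add up to 552 while the stated total is 554, so a sanity check should not assume the two always agree.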