ABI-Covid-19 / moh-data

Apache License 2.0

Using the Johns Hopkins database #9

Open mahyar-osn opened 4 years ago

mahyar-osn commented 4 years ago

Switching to the Johns Hopkins database, as it has more information available (e.g. number of recovered cases, number of deaths, etc.). Its format also seems to be more consistent, which should avoid constant changes to the code.

EDIT: We still need to keep MoH in the code to fetch the probable and confirmed cases.

agarny commented 4 years ago

Just a reminder, we need/want data for:

mahyar-osn commented 4 years ago

We can fetch these from Johns Hopkins by the look of it. Hopefully, I will have a new update to the code by the end of today to test with seir.

agarny commented 4 years ago

From what I can tell, the JHU data doesn't distinguish between confirmed and probable cases. In fact, confirmed+probable = confirmed for them...

mahyar-osn commented 4 years ago

You're right. I just went through the other csv files and it seems that only confirmed is available. So I will use JHU for the cumulative number of deaths and the cumulative number of recovered cases, and MoH for probable and confirmed cases (and also to check the validity of JHU in case there are discrepancies).
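
A minimal sketch of that split, assuming each source has already been reduced to a per-date dictionary. The function name `merge_sources`, the field names and the sample figures are all hypothetical and not taken from either dataset:

```python
def merge_sources(moh, jhu):
    """Combine per-date records: cases from MoH, deaths/recovered from JHU.

    Hypothetical layouts:
      moh: {date: {"confirmed": int, "probable": int}}
      jhu: {date: {"confirmed": int, "deaths": int, "recovered": int}}
    """
    merged = {}
    # Only keep dates that both sources report.
    for date in sorted(moh.keys() & jhu.keys()):
        m, j = moh[date], jhu[date]
        merged[date] = {
            "confirmed": m["confirmed"],
            "probable": m["probable"],
            "deaths": j["deaths"],
            "recovered": j["recovered"],
            # JHU folds confirmed + probable into one "confirmed" count,
            # so compare it against the MoH total to spot discrepancies.
            "jhu_mismatch": j["confirmed"] != m["confirmed"] + m["probable"],
        }
    return merged


# Made-up figures, for illustration only.
moh = {
    "2020-04-01": {"confirmed": 10, "probable": 2},
    "2020-04-02": {"confirmed": 12, "probable": 3},
}
jhu = {
    "2020-04-01": {"confirmed": 12, "deaths": 0, "recovered": 1},
    "2020-04-02": {"confirmed": 14, "deaths": 1, "recovered": 2},
}
merged = merge_sources(moh, jhu)
```

Dates flagged with `jhu_mismatch` would be the ones to check by hand before trusting the JHU deaths and recovered figures.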

mahyar-osn commented 4 years ago

Just edited the issue.

agarny commented 4 years ago

Otherwise, and FWIW, I have been populating a Google Sheet since the beginning of Covid-19 in NZ: https://docs.google.com/spreadsheets/d/16UMnHbnBHju-fK45aSdaJhVmrXJpy71oxSiN_AvqV84. There is a way to access that data from Python (see https://towardsdatascience.com/accessing-google-spreadsheet-data-using-python-90a5bc214fd2) and I have started implementing it, but it's probably going to be only for me to test things.

agarny commented 4 years ago

> Otherwise, and FWIW, I have been populating a Google Sheets since the beginning of Covid-19 in NZ: https://docs.google.com/spreadsheets/d/16UMnHbnBHju-fK45aSdaJhVmrXJpy71oxSiN_AvqV84. There is a way to access that data from Python (see https://towardsdatascience.com/accessing-google-spreadsheet-data-using-python-90a5bc214fd2) and I have started implementing it, but it's probably going to be only for me to test things.

Actually, scrap that. This requires authentication and, although I got it working, it relies on a client_secret.json file, which we shouldn't really share publicly. That being said, my Google Sheet is accessible to anyone who has the URL, so we can simply get the data using:

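import requests

url = 'https://docs.google.com/spreadsheets/u/1/d/16UMnHbnBHju-fK45aSdaJhVmrXJpy71oxSiN_AvqV84/export?format=csv&id=16UMnHbnBHju-fK45aSdaJhVmrXJpy71oxSiN_AvqV84&gid=0'
response = requests.get(url, timeout=30)  # no authentication needed: the sheet is shared by URL
response.raise_for_status()  # fail loudly on an HTTP error instead of printing an error page
print(response.text)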
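
The downloaded CSV could then be turned into rows for further processing. A rough sketch using only the standard library: the helper name `parse_sheet_csv` and the sample columns are assumptions for illustration, and the real sheet's column names may well differ.

```python
import csv
import io


def parse_sheet_csv(content):
    """Decode the CSV bytes of a sheet export into a list of row dicts."""
    text = content.decode("utf-8-sig")  # tolerate a leading BOM, if any
    return list(csv.DictReader(io.StringIO(text)))


# Made-up sample standing in for response.content.
sample = b"Date,Confirmed,Probable\r\n2020-04-01,10,2\r\n2020-04-02,12,3\r\n"
rows = parse_sheet_csv(sample)
```

Each row is a dictionary keyed by the header line, with every value still a string, so numeric columns need an explicit `int(...)` before use.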

mahyar-osn commented 4 years ago

Thanks @agarny. I would also like to add that spreadsheet as an extra source.

Meanwhile, I created a new PR related to the current issues.

agarny commented 4 years ago

> Thanks @agarny. I would also like to add that spreadsheet as an extra source.

I intend to update that Google Sheet as we go, so sure, feel free to use it as an extra source.

> Meanwhile, I created a new PR related to the current issues.

Thanks, am going to check it out.