selfdeceited / moscow-covid-19-awareness-stats

Crude calculation trying to detect real number of infected people
https://moscow-covid-19-awareness-stats.now.sh/
Apache License 2.0

Collaboration request #1

Open sergei-mironov opened 4 years ago

sergei-mironov commented 4 years ago

Hi. I found your repo on GitHub by searching for the 'moscow covid' keywords. I'm trying to maintain a repository on a similar topic:

https://github.com/grwlf/COVID-19_plus_Russia

In this repo I'm trying to supplement the CSSE repo with Moscow/SPb data (please see the readme). I'd be happy to join efforts in monitoring per-city COVID data in Russia.

selfdeceited commented 4 years ago

Hi, that's great! I checked your fork.

May I ask how you fetch the data? So far I haven't been able to find any open API with per-city stats, so it looks like you do manual updates.

As a fallback, I think I could fetch this data from the CSVs in your repo instead of updating it myself every time, but I guess any open API would still suffice. Are you by chance aware of any, for data on Russia and Moscow, of course?

selfdeceited commented 4 years ago

Correction: 'manual updates' = Yandex sensor, of course. Still, a raw set of CSVs does not allow querying all kinds of data.

sergei-mironov commented 4 years ago

Hi. Thanks for the important question. I've described the data sources and the procedure I follow in the readme. Please check:

https://github.com/grwlf/COVID-19_plus_Russia#data

In short, I currently fetch the data from the Yandex COVID map, parse the HTML, and save the data in the CSV format that CSSE currently uses. One problem is that CSSE changes its data format from time to time, but I think the right thing to do is to follow them nevertheless.

> As a fallback, I think I could fetch this data from the CSVs in your repo instead of updating it myself every time, but I guess any open API would still suffice. Are you by chance aware of any, for data on Russia and Moscow, of course?

Unfortunately, I don't know of any official machine-readable APIs for Moscow/Russia. I think the most likely places for such an API to appear are Yandex (https://yandex.ru/maps/covid19) or 2GIS (https://covid.2gis.ru), the best-known geolocation services in Russia.

selfdeceited commented 4 years ago

Source-wise, I think it makes sense to maintain a single source of truth, which should be the CSVs in your repo, since that work is required either way. I already considered parsing the Yandex HTML, but I don't think it makes sense to duplicate the work.

Although I would prefer Yandex or 2GIS to publish an API of their own, in order not to be blocked I'm thinking of writing a very primitive one based on the https://github.com/grwlf/COVID-19_plus_Russia data (cache it and invalidate once a day, or on a webhook when your repo is updated), with something like this available:

    `/v1/api/${region}` # stats for all time per city/region
    `/v1/api/${region}/${date}` # stats for specific city/region on specific date
    `/v1/api/${date}` # stats for all RU regions on specific date
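
The routes and the caching idea above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's code: the ISO date format in URLs, the `region` and `date` row keys, and the names `parse_route`, `DailyCache` and `query` are all hypothetical. Note that `/v1/api/${region}` and `/v1/api/${date}` share the same shape, so the sketch treats a single segment as a date only when it looks like one.

```python
import re
import time
from typing import Callable, Dict, List, Optional, Tuple

# Assumption: dates appear in URLs as ISO strings such as 2020-04-15.
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

Row = Dict[str, str]  # hypothetical row shape with "region" and "date" keys


def parse_route(path: str) -> Tuple[Optional[str], Optional[str]]:
    """Map a request path to (region, date); None means 'not specified'."""
    parts = [p for p in path.split("/") if p]
    if parts[:2] != ["v1", "api"] or not 1 <= len(parts) - 2 <= 2:
        raise ValueError(f"unknown route: {path}")
    args = parts[2:]
    if len(args) == 2:
        region, date = args
        if not DATE_RE.match(date):
            raise ValueError(f"bad date: {date}")
        return region, date
    # One segment is ambiguous: treat it as a date only if it looks like one.
    if DATE_RE.match(args[0]):
        return None, args[0]
    return args[0], None


class DailyCache:
    """Keeps loaded rows in memory and reloads them once they are too old."""

    def __init__(self, loader: Callable[[], List[Row]],
                 max_age: float = 24 * 3600,
                 clock: Callable[[], float] = time.time) -> None:
        self._loader = loader
        self._max_age = max_age
        self._clock = clock
        self._rows: Optional[List[Row]] = None
        self._loaded_at = 0.0

    def invalidate(self) -> None:
        """Drop the cached rows, e.g. from a webhook handler on repo update."""
        self._rows = None

    def rows(self) -> List[Row]:
        expired = self._clock() - self._loaded_at >= self._max_age
        if self._rows is None or expired:
            self._rows = self._loader()
            self._loaded_at = self._clock()
        return self._rows


def query(cache: DailyCache, path: str) -> List[Row]:
    """Return the cached rows that match the region and/or date in the path."""
    region, date = parse_route(path)
    return [r for r in cache.rows()
            if (region is None or r["region"] == region)
            and (date is None or r["date"] == date)]
```

A loader that reads the CSVs from the upstream repo would be passed to `DailyCache`; `query(cache, "/v1/api/Moscow")` then returns every cached row for that region without re-reading the files until the cache expires or `invalidate()` is called.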

I'm open to any better ideas and suggestions :)

sergei-mironov commented 4 years ago

Glad to know this!

sergei-mironov commented 4 years ago

I've just added access.py, which contains a pandas-based API for loading the data. I also tried to convert the early data to the up-to-date CSV format used upstream.
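
Converting early data to a newer CSV layout might look roughly like the sketch below. It is not the code in access.py: it uses only the standard library, the function names `upgrade_rows` and `to_csv` are hypothetical, and the header names are assumptions recalled from the CSSE daily reports (and only a subset of the newer columns), so they should be checked against the upstream files.

```python
import csv
import io
from typing import Dict, List

# Assumed header renames between early and later CSSE daily reports.
OLD_TO_NEW = {
    "Province/State": "Province_State",
    "Country/Region": "Country_Region",
    "Last Update": "Last_Update",
    "Latitude": "Lat",
    "Longitude": "Long_",
}

# Assumed subset of the newer column layout, in output order.
NEW_COLUMNS = [
    "Province_State", "Country_Region", "Last_Update",
    "Lat", "Long_", "Confirmed", "Deaths", "Recovered",
]


def upgrade_rows(old_csv_text: str) -> List[Dict[str, str]]:
    """Rename old headers and fill columns missing from the old file with ''."""
    reader = csv.DictReader(io.StringIO(old_csv_text))
    upgraded = []
    for row in reader:
        renamed = {OLD_TO_NEW.get(key, key): value for key, value in row.items()}
        upgraded.append({col: renamed.get(col, "") for col in NEW_COLUMNS})
    return upgraded


def to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize upgraded rows using the newer column order."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=NEW_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Keeping the rename table in one place means that when the upstream format changes again, only `OLD_TO_NEW` and `NEW_COLUMNS` need updating.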