Phrytes / COVID19_RKI_Germany

A short script that can be used to download data about COVID19 cases in Germany from Robert Koch Institute
MIT License

Tip: use rest api #1

Open SIRprise opened 4 years ago

SIRprise commented 4 years ago

Use a URL like this:

https://services7.arcgis.com/mOBPykOjAyBO2ZKk/arcgis/rest/services/RKI_COVID19/FeatureServer/0/query?where=Meldedatum+%3E+%28CURRENT_TIMESTAMP+-+3%29&objectIds=&time=&resultType=none&outFields=*&returnIdsOnly=false&returnUniqueIdsOnly=false&returnCountOnly=false&returnDistinctValues=false&cacheHint=false&orderByFields=Meldedatum&outStatistics=&having=&resultOffset=&resultRecordCount=&sqlFormat=none&f=json&token=

The challenge is to write queries whose result set stays below 2000 records, and to sum up the Landkreise (districts) into the Bundesländer (federal states) the right way. I have problems with that. I guess some Landkreise don't report every day, so I cannot simply group by Bundesland and date.

Change f=json to f=html to see the available query options.
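A minimal sketch of how the query URL above could be assembled in Python instead of being pasted by hand. The endpoint and the parameter names (where, outFields, orderByFields, f) are taken from the URL in this comment; the function name build_query_url and its defaults are made up for illustration, and the service may have changed or moved since this was posted.

```python
from urllib.parse import urlencode

# Endpoint copied from the URL in the comment above.
QUERY_ENDPOINT = (
    "https://services7.arcgis.com/mOBPykOjAyBO2ZKk/arcgis/rest/services/"
    "RKI_COVID19/FeatureServer/0/query"
)


def build_query_url(where, out_fields="*", order_by="Meldedatum", fmt="json"):
    """Return a query URL; build_query_url is a hypothetical helper name."""
    params = {
        "where": where,            # SQL-like filter expression
        "outFields": out_fields,   # "*" asks for all fields
        "orderByFields": order_by,
        "f": fmt,                  # "json" for data, "html" for the query form
    }
    # urlencode percent-escapes the filter, e.g. ">" becomes "%3E".
    return QUERY_ENDPOINT + "?" + urlencode(params)


url = build_query_url("Meldedatum > (CURRENT_TIMESTAMP - 3)")
print(url)
```

Passing fmt="html" gives the browsable form mentioned above, which is handy for trying out filters before scripting them.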

Phrytes commented 4 years ago

Thanks, didn't see your post earlier. I will try the API soon. Have you also seen this page on Wikipedia?

https://en.wikipedia.org/wiki/2020_coronavirus_pandemic_in_Germany#Robert_Koch_Institute

This is way more complete than my small collection.

SIRprise commented 4 years ago

Ah, thanks! Here is another data source and a nice visualization: https://covid-19.openmedical.de/

Phrytes commented 4 years ago

Nice indeed, thanks!

averissimo commented 4 years ago

Awesome @SIRprise

I've been using data from covid19-eu-zh/covid19-eu-data, but would really like to look at Bayern alone.

Something along the lines of your suggestion might do it! At least until that data repository also adds this data to their pipeline (it's on their to-do list).

My idea is to update this report: https://averissimo.github.io/covid19-analysis/germany.html

@Phrytes thanks for posting on reddit! found this via your post

averissimo commented 4 years ago

@SIRprise you can always use the resultOffset and resultRecordCount query parameters to download all the data in chunks.
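A rough sketch of the chunked download, assuming the endpoint from the first comment is still reachable and that the JSON response carries its records in a "features" list with an "attributes" dict per record (the usual ArcGIS layout, but treat it as an assumption here). The helper names fetch_page and fetch_all are made up. The loop simply keeps raising resultOffset until an empty page comes back.

```python
import json
import urllib.parse
import urllib.request

# Endpoint copied from the first comment in this thread.
BASE_URL = (
    "https://services7.arcgis.com/mOBPykOjAyBO2ZKk/arcgis/rest/services/"
    "RKI_COVID19/FeatureServer/0/query"
)


def fetch_page(offset, page_size, where="1=1"):
    """Download one chunk of records (hypothetical helper, needs network)."""
    params = {
        "where": where,
        "outFields": "*",
        # Paging needs a stable sort order; a unique key would be safer
        # than Meldedatum, which is only used here because it is in the thread.
        "orderByFields": "Meldedatum",
        "resultOffset": offset,
        "resultRecordCount": page_size,
        "f": "json",
    }
    url = BASE_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        payload = json.load(response)
    # Assumed response layout: {"features": [{"attributes": {...}}, ...]}
    return [feature["attributes"] for feature in payload.get("features", [])]


def fetch_all(page_fetcher=fetch_page, page_size=2000):
    """Collect all records by requesting pages until one comes back empty."""
    rows = []
    offset = 0
    while True:
        page = page_fetcher(offset, page_size)
        if not page:
            break
        rows.extend(page)
        offset += len(page)  # advance by what was actually returned
    return rows
```

Advancing by the number of rows actually returned, rather than by page_size, keeps the loop correct even if the server caps pages at fewer records than requested. It costs one extra request at the end.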

Anyway, the full dataset is now available at averissimo/covid19-de_rki-data

Phrytes commented 4 years ago

@averissimo Great, thanks! Do you think you can safely sum the district data to get the numbers for the Bundesländer? I can't remember this reddit post. Are you sure it was mine? Do you maybe have a URL?

averissimo commented 4 years ago

I think so: the district data seems to be consistent with the other RKI data available, though with some errors.

RKI data from federal states (dataset from covid19-eu-zh/covid19-eu-data that retrieves from RKI) vs from districts (dataset from ARCGIS downloaded by me)
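One way the district-to-state summation could look, as a sketch only. It assumes each record holds the number of newly reported cases for one district and day, and that the fields are called Bundesland, Meldedatum and AnzahlFall; only Meldedatum appears in this thread, so the other two names are assumptions to check against the service. Summing new cases per state and day and then accumulating over time means a district with no record on a given day just contributes zero, which sidesteps the "not every Landkreis reports every day" problem raised earlier.

```python
from collections import defaultdict
from itertools import accumulate


def cumulative_cases_by_state(rows):
    """Return {state: {day: cumulative cases}} from per-district records.

    Field names Bundesland and AnzahlFall are assumed, not confirmed here.
    """
    daily = defaultdict(lambda: defaultdict(int))
    for row in rows:
        # Add up new cases of all districts in a state for each day.
        daily[row["Bundesland"]][row["Meldedatum"]] += row["AnzahlFall"]

    totals = {}
    for state, per_day in daily.items():
        # Works for epoch-millisecond integers as well as ISO date strings.
        days = sorted(per_day)
        running = accumulate(per_day[day] for day in days)
        totals[state] = dict(zip(days, running))
    return totals


# Made-up sample records, for illustration only.
sample = [
    {"Bundesland": "Bayern", "Landkreis": "Kreis A", "Meldedatum": "2020-03-01", "AnzahlFall": 3},
    {"Bundesland": "Bayern", "Landkreis": "Kreis B", "Meldedatum": "2020-03-01", "AnzahlFall": 2},
    {"Bundesland": "Bayern", "Landkreis": "Kreis A", "Meldedatum": "2020-03-02", "AnzahlFall": 4},
    {"Bundesland": "Hessen", "Landkreis": "Kreis C", "Meldedatum": "2020-03-02", "AnzahlFall": 1},
]
result = cumulative_cases_by_state(sample)
```

Days on which a state has no records at all do not appear in the output, so they would need to be forward-filled before plotting.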

reddit post. If it wasn't you then it at least got me here! :-)

edit: The second link was corrected as it was pointing to a wrong section on the report

averissimo commented 4 years ago

@Phrytes there's a caveat for this data.

Values for the last few days are underreported. There's some discussion of this on the service page.

DGrothe-PhD commented 4 years ago

averissimo wrote: "I've been using data from covid19-eu-zh/covid19-eu-data, but would really like to look at Bayern alone."

Here is clear tabular data for the Bavarian Regierungsbezirke and Landkreise: https://www.lgl.bayern.de/gesundheit/infektionsschutz/infektionskrankheiten_a_z/coronavirus/karte_coronavirus/index.htm

I just tried to gather data from that page, but found that we somehow have to teach pandas to look only at the table that starts with <table id="tableLandkreise" class="mb-1 table_sticky">. If we could use jQuery-style selectors in Python, that might do the trick :thinking:
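For what it's worth, pandas.read_html accepts an attrs argument, so passing attrs={"id": "tableLandkreise"} should restrict it to that one table (it needs lxml or BeautifulSoup installed). Below is a dependency-free sketch of the same idea using only the standard library. The class and function names are made up, and the sample HTML is invented: the real page's columns are not shown in this thread.

```python
from html.parser import HTMLParser


class TableById(HTMLParser):
    """Collect the cell texts of the one table with the given id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.depth = 0      # 0 = outside the wanted table, 1 = directly inside
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.depth:
                self.depth += 1          # nested table inside the wanted one
            elif dict(attrs).get("id") == self.table_id:
                self.depth = 1
        elif self.depth == 1 and tag == "tr":
            self._row = []
        elif self.depth == 1 and tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if not self.depth:
            return
        if tag == "table":
            self.depth -= 1
        elif self.depth == 1 and tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif self.depth == 1 and tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_table(html, table_id):
    """Return the table's rows as lists of strings (hypothetical helper)."""
    parser = TableById(table_id)
    parser.feed(html)
    parser.close()
    return parser.rows


# Invented sample markup; only the <table> start tag comes from the thread.
sample_html = """
<table id="other"><tr><td>ignore me</td></tr></table>
<table id="tableLandkreise" class="mb-1 table_sticky">
<tr><th>Landkreis</th><th>Cases</th></tr>
<tr><td>Example district</td><td>12</td></tr>
</table>
"""
rows = extract_table(sample_html, "tableLandkreise")
```

The resulting list of rows can be handed to pandas.DataFrame(rows[1:], columns=rows[0]) if a DataFrame is wanted.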