corona-zahlen-landkreis / corona_landkreis_fallzahlen_scraping

Scraping the websites of Germany's local districts for newer corona case numbers!
GNU General Public License v3.0
17 stars 9 forks

Convert landkreise.ods to landkreise.csv #64

Open dasmur opened 4 years ago

dasmur commented 4 years ago

I think it would be really awesome to convert the overview sheet from .ods to .csv.

Advantages:

dadosch commented 4 years ago

We actually had it as CSV, but I think it was too large for GitHub to display (I may be wrong here, though; I can't remember for sure).

Btw, I'm curious: how did you find this project? ;)

dasmur commented 4 years ago

@dadosch Honestly, I had to check whether GitHub actually has a file size limit, because I had completely missed this aspect. Indeed, there is a limit of 512 KB (it is even mentioned on the referenced page). Afterwards, I did a quick test and converted the current sheet to .csv locally, which produced a file of approx. 17 KB, so this should be fine for GitHub. But even if the file exceeded the limit at some point in the future, I think the second aspect (plain text) should not be underestimated. Since this file has to be modified for every new county added, errors can easily be introduced unnoticed, and they can be really hard to track down.
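The two points above (staying under the display limit, and changes being visible as plain text) can be illustrated with a minimal Python sketch. It uses only the standard library. The column names, rows, and URLs are hypothetical, since the real sheet's layout is not shown in this thread, and the 512 KB figure is simply the one quoted above, not something the code verifies.

```python
import csv
import difflib
import io

# Display limit as quoted in the comment above (512 KB); an assumption here.
DISPLAY_LIMIT_BYTES = 512 * 1024


def rows_to_csv(rows):
    """Serialize rows (lists of strings) to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def fits_display_limit(csv_text, limit=DISPLAY_LIMIT_BYTES):
    """Return True if the UTF-8 encoded text is within the size limit."""
    return len(csv_text.encode("utf-8")) <= limit


def csv_diff(old_text, new_text):
    """Return the differences between two CSV versions as unified-diff lines."""
    return list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile="landkreise.csv (old)", tofile="landkreise.csv (new)",
        lineterm="",
    ))


# Hypothetical rows; the real columns of the overview sheet may differ.
old = rows_to_csv([
    ["county", "url"],
    ["Example-Kreis A", "https://example.org/a"],
])
new = rows_to_csv([
    ["county", "url"],
    ["Example-Kreis A", "https://example.org/a"],
    ["Example-Kreis B", "https://example.org/b"],
])

print(fits_display_limit(new))
print("\n".join(csv_diff(old, new)))
```

With a plain-text file, adding a county shows up as a single added line in the diff, which is what makes accidental edits to other rows easy to spot in review.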

dasmur commented 4 years ago

Regarding your second question (how I found this project): Actually, I planned to use R to do some statistics on current corona data. At first, I tried to use the data provided by the RKI on ArcGIS, but I was not able to find the corresponding API documentation in a reasonable amount of time. So I searched GitHub for projects using the API (to be more precise, I searched for any project related to "corona germany" ;) ) in order to understand its usage. Even though your project was not directly related to this goal, I liked the general idea.

dadosch commented 4 years ago

Btw, do you know about the efforts of "Risklayer, Karlsruhe"? They crowdsource the data for every Kreis each day (without scripts, just humans): https://docs.google.com/spreadsheets/d/1wg-s4_Lz2Stil6spQEYFdZaBEp8nWW26gVyfHqvcl8s/edit