corona-zahlen-landkreis / corona_landkreis_fallzahlen_scraping

Scraping the websites of Germany's local districts for newer corona case numbers!
GNU General Public License v3.0

consolidate / compare with ZEIT ONLINE's effort #47

Closed jgehrcke closed 4 years ago

jgehrcke commented 4 years ago

It looks like ZEIT ONLINE has worked pretty hard on a similar approach, given their new data set / visualization: https://github.com/jgehrcke/covid-19-germany-gae/issues/43

It would be cool to have two projects doing this. This one would be a much more transparent, and therefore better, source than ZEIT ONLINE (who probably do a good job, but still!). Don't give up! :) :rocket:

Their current sum validates the idea (I hope): Landkreis (LK) data is fresher. However, right now it's "only" ~1000 cases that are not yet in the state-level data (Landesdaten). Does that sound about right? A lot can be learned... :)

dadosch commented 4 years ago

I'm actually really curious how they collect the data. A few Landkreise put up images of tables (yes, you read that right).

I think we need a more stable way to parse the websites (the time data in particular breaks easily if an LK changes something).
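To illustrate what "more stable" could mean for the time data, here is a minimal sketch. It is not this project's parser: the function name `parse_german_timestamp` and both regular expressions are assumptions made up for illustration, and real district pages would need more patterns. The idea is to search the page text for a date and an optional time instead of relying on one exact format string, and to return `None` rather than crash when nothing usable is found.

```python
import re
from datetime import datetime

# Hypothetical patterns: day.month.year with optional spaces, and "HH:MM Uhr" / "HH.MM Uhr".
_DATE_RE = re.compile(r"(\d{1,2})\.\s*(\d{1,2})\.\s*(\d{4})")
_TIME_RE = re.compile(r"(\d{1,2})[:.](\d{2})\s*Uhr")


def parse_german_timestamp(text):
    """Return the first date (and time, if present) found in text, else None."""
    date_match = _DATE_RE.search(text)
    if date_match is None:
        return None
    day, month, year = (int(group) for group in date_match.groups())

    # The time is optional; fall back to midnight when the page gives only a date.
    hour = minute = 0
    time_match = _TIME_RE.search(text)
    if time_match:
        hour, minute = int(time_match.group(1)), int(time_match.group(2))

    try:
        return datetime(year, month, day, hour, minute)
    except ValueError:
        # Matched digits that do not form a real date or time, e.g. 31.02.2020.
        return None
```

A caller can then treat `None` as "layout changed, needs a look" and keep the previous value instead of failing the whole scrape.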

We also need more contributors, even if each of them contributes only one LK, or just sorts out which LKs are parsable (see the project tab).

(Note that they drop the leading zero in the AGS: https://de.wikipedia.org/wiki/Amtlicher_Gemeindeschl%C3%BCssel#Aufbau)

There are also LKs that don't publish data, such as Hochtaunuskreis (AGS 06434).
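On the dropped leading zero: if the two data sets are ever compared, the keys have to be brought into the same form first. A minimal sketch, assuming the keys are five-digit district-level AGS values like 06434 (the helper name `normalize_ags` is hypothetical, not something from either project):

```python
def normalize_ags(ags):
    """Return a district-level AGS as a zero-padded five-character string."""
    digits = str(ags).strip()
    # Reject anything that is not a plain number of at most five digits.
    if not digits.isdigit() or len(digits) > 5:
        raise ValueError(f"not a district-level AGS: {ags!r}")
    # Restore a leading zero that was lost, e.g. when the key was stored as an integer.
    return digits.zfill(5)
```

With this, both `6434` and `"06434"` map to `"06434"`, so the keys from either source can be matched directly.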