Closed: psteinb closed this issue 4 years ago
Here is the pretty-printed payload from the query described above:
{
"objectIdFieldName": "ObjectId",
"uniqueIdField": {
"name": "ObjectId",
"isSystemMaintained": true
},
"globalIdFieldName": "",
"fields": [
{
"name": "Datum",
"type": "esriFieldTypeString",
"alias": "Datum",
"sqlType": "sqlTypeNVarchar",
"length": 2147483647,
"domain": null,
"defaultValue": null
},
{
"name": "Fallzahl",
"type": "esriFieldTypeInteger",
"alias": "Fallzahl",
"sqlType": "sqlTypeInteger",
"domain": null,
"defaultValue": null
},
{
"name": "ObjectId",
"type": "esriFieldTypeOID",
"alias": "ObjectId",
"sqlType": "sqlTypeInteger",
"domain": null,
"defaultValue": null
},
{
"name": "Sterbefall",
"type": "esriFieldTypeInteger",
"alias": "Sterbefall",
"sqlType": "sqlTypeOther",
"domain": null,
"defaultValue": null
},
{
"name": "Genesungsfall",
"type": "esriFieldTypeInteger",
"alias": "Genesungsfall",
"sqlType": "sqlTypeOther",
"domain": null,
"defaultValue": null
},
{
"name": "Anzeige_Indikator",
"type": "esriFieldTypeString",
"alias": "Anzeige_Indikator",
"sqlType": "sqlTypeOther",
"length": 10,
"domain": null,
"defaultValue": null
}
],
"features": [
{
"attributes": {
"Datum": "7.03.20",
"Fallzahl": 2,
"ObjectId": 1,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "08.03.20",
"Fallzahl": 2,
"ObjectId": 2,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "09.03.20",
"Fallzahl": 2,
"ObjectId": 3,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "10.03.20",
"Fallzahl": 5,
"ObjectId": 4,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "11.03.20",
"Fallzahl": 5,
"ObjectId": 5,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "12.03.20",
"Fallzahl": 5,
"ObjectId": 6,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "13.03.20",
"Fallzahl": 12,
"ObjectId": 7,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "14.03.20",
"Fallzahl": 18,
"ObjectId": 8,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "15.03.20",
"Fallzahl": 18,
"ObjectId": 9,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "16.03.20",
"Fallzahl": 25,
"ObjectId": 10,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "17.03.20",
"Fallzahl": 35,
"ObjectId": 11,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "18.03.20",
"Fallzahl": 50,
"ObjectId": 12,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "19.03.20",
"Fallzahl": 60,
"ObjectId": 13,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "20.03.20",
"Fallzahl": 97,
"ObjectId": 14,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "21.03.20",
"Fallzahl": 115,
"ObjectId": 15,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": null
}
},
{
"attributes": {
"Datum": "22.03.20",
"Fallzahl": 139,
"ObjectId": 16,
"Sterbefall": null,
"Genesungsfall": null,
"Anzeige_Indikator": "x"
}
}
]
}
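The per-day attributes in this payload can be pulled out with `jq` directly. A minimal sketch, assuming `jq` is installed; the inlined payload is a shortened two-feature sample of the one above:

```shell
# Extract date/case pairs from a payload shaped like the one above.
# The two-feature payload here is a shortened sample for illustration.
payload='{"features":[{"attributes":{"Datum":"7.03.20","Fallzahl":2}},{"attributes":{"Datum":"08.03.20","Fallzahl":2}}]}'
echo "$payload" | jq -r '.features[].attributes | "\(.Datum),\(.Fallzahl)"'
# 7.03.20,2
# 08.03.20,2
```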
I could try to have a look at this this evening. Nevertheless, I need to see how fast I can move forward using R ;) It would be nice to also use GH Actions for retrieving the data and updating the plots automatically. But that is another issue.
Sure thing. Take your time. Any contribution is welcome. No need to mingle with R if you don't want to. For this, a csv is all I need. Go for GH Actions; they should be made for this.
Solution as a shell script… I also added the other available numbers (deceased, recovered, hospitalized).
The problem was that the query before the one we actually want to process has to be sent first; otherwise status code 400 is returned (missing authentication). As @psteinb pointed out before, the queries can be spotted with the networking view in the developer tools of modern browsers. In case other data should be the result of queries, they can be added as well.
#!/usr/bin/env sh
assert_tools () {
err=0
while test $# -gt 0; do
which "$1" 1>/dev/null 2>&1 || {
>&2 printf "tool missing: %s\n" "$1"
err=$(( $err + 1 ))
}
shift
done
test $err -eq 0 || exit $err
}
# test for tools used, sorted: likely to unlikely
dependencies="printf cat cut rev test date sleep curl jq"
assert_tools $dependencies
# optional first argument: name of an existing file, falls back to g.json
test -z "$1" && fn="g.json" || fn="$1"
# could test for existing file
# disable query new file with cli option
test "$2" = "noquery" || {
# user agent
a="Mozilla/5.0 (X11; Linux x86_64; rv:74.0) Gecko/20100101 Firefox/74.0"
# 1. Request
url1="https://services.arcgis.com/ORpvigFPJUhb8RDF/arcgis/rest/services/corona_DD_3/FeatureServer/0/query?f=json&where=Anzeige_Indikator%3D%27x%27&returnGeometry=false&spatialRel=esriSpatialRelIntersects&outFields=*&resultOffset=0&resultRecordCount=50&cacheHint=true"
# 2nd request; refuses to work before the 1st => status code 400
url2="https://services.arcgis.com/ORpvigFPJUhb8RDF/arcgis/rest/services/corona_DD_3/FeatureServer/0/query?f=json&where=Fallzahl%20IS%20NOT%20NULL&returnGeometry=false&spatialRel=esriSpatialRelIntersects&outFields=*&resultOffset=0&resultRecordCount=2000&cacheHint=true"
# alternative tool: wget
# the call is silent, skips certificate checks and sends a custom user agent (browser name/version etc.); its output is discarded
curl -s -k -A "$a" "$url1" 1>/dev/null 2>&1
# wait a moment to not trigger some protection on server side
sleep 1 # although this doesn't seem to be necessary at all
# write actual content to file $fn
curl -s -o "$fn" -k -A "$a" "$url2"
}
# information on file names, print to stderr
1>&2 echo "Reading from: $fn"
test -f "$fn" || { 1>&2 echo "file does not exist."; exit 1; }
tf=$(date +%Y-%m-%dT%H:%M:%S)"_"$(echo "$fn"|rev|cut -d. -f2-|rev)".csv"
1>&2 echo "Writing to: $tf"
# check for file extension
test "$(echo "$fn"|rev|cut -d. -f1|rev)" = "json" && {
# head line as found in csv of the repo
1>"$tf" echo "city,date,tod_hhmm,diagnosed,deceased,recovered,hospitalized"
# csv header as used in R script
#~ 1>"$tf" echo "city,date,tod_hhmm,diagnosed"
# following values are assumed constant
# standard values for place and time
p="Dresden"
c="12:00" # time of date, format H:M
# process lines (each day) of the files, filter json with `jq`
for i in $(jq -c '.features[] .attributes | { Datum, Fallzahl, Sterbefall, Genesungsfall, Hospitalisierung }' "$fn");
do
# numbers; null values are mapped to 0
# n: number of cases
n=$(echo $i|cut -d} -f1|cut -d, -f2|cut -d: -f2); test "$n" = "null" && n="0"
# b: deceased
b=$(echo $i|cut -d} -f1|cut -d, -f3|cut -d: -f2); test "$b" = "null" && b="0"
# r: recovered
r=$(echo $i|cut -d} -f1|cut -d, -f4|cut -d: -f2); test "$r" = "null" && r="0"
# h: hospitalized
h=$(echo $i|cut -d} -f1|cut -d, -f5|cut -d: -f2); test "$h" = "null" && h="0"
# t: timestamp
t=$(echo $i|cut -d} -f1|cut -d, -f1|cut -d: -f2|cut -d'"' -f2)
# split timestamp in parts, if leading zero, reduce to single digits
# d: day
d=$(echo $t|cut -d"." -f1); test "$(echo $d|cut -c1)" = "0" && d=$(echo $d|cut -c2)
# m: month
m=$(echo $t|cut -d"." -f2); test "$(echo $m|cut -c1)" = "0" && m=$(echo $m|cut -c2)
# y: year
y=$(echo $t|cut -d"." -f3-); test "$(echo $y|cut -c1)" = "0" && y=$(echo $y|cut -c2)
# fix year to 4 digits
test "$y" -lt "100" && y=$(( 2000 + $y ));
# output for csv
# including more numbers
1>>"$tf" printf "%s,%04d-%02d-%02d,%s,%s,%s,%s,%s\n" "$p" "$y" "$m" "$d" "$c" "$n" "$b" "$r" "$h"
# csv as used by R script, number infected only
#~ 1>>"$tf" printf "%s,%04d-%02d-%02d,%s,%s\n" "$p" "$y" "$m" "$d" "$c" "$n"
done
}
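The date handling inside the loop (split `DD.MM.YY`, strip leading zeros, expand two-digit years) can be exercised in isolation; this is just the script's own logic restated as a standalone snippet:

```shell
# Normalize a "DD.MM.YY" timestamp to ISO "YYYY-MM-DD", as the loop above does.
t="08.03.20"
d=$(echo "$t" | cut -d. -f1); test "$(echo "$d" | cut -c1)" = "0" && d=$(echo "$d" | cut -c2)
m=$(echo "$t" | cut -d. -f2); test "$(echo "$m" | cut -c1)" = "0" && m=$(echo "$m" | cut -c2)
y=$(echo "$t" | cut -d. -f3); test "$(echo "$y" | cut -c1)" = "0" && y=$(echo "$y" | cut -c2)
# two-digit years are assumed to mean 20xx
test "$y" -lt 100 && y=$(( 2000 + y ))
printf "%04d-%02d-%02d\n" "$y" "$m" "$d"
# 2020-03-08
```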
@vv01f Dresden just upgraded the data they expose. Would your code be capable of handling that?
@psteinb As it pulls the unchanged JSON (the keywords did not change; the new ones have already been added), it continues to work.
current result:
city,date,tod_hhmm,diagnosed,deceased,recovered,hospitalized
Dresden,2020-03-07,12:00,2,0,0,0
Dresden,2020-03-08,12:00,2,0,0,0
Dresden,2020-03-09,12:00,2,0,0,0
Dresden,2020-03-10,12:00,5,0,0,0
Dresden,2020-03-11,12:00,5,0,0,2
Dresden,2020-03-12,12:00,5,0,0,2
Dresden,2020-03-13,12:00,12,0,0,2
Dresden,2020-03-14,12:00,18,0,0,2
Dresden,2020-03-15,12:00,18,0,0,3
Dresden,2020-03-16,12:00,25,0,0,4
Dresden,2020-03-17,12:00,35,0,0,5
Dresden,2020-03-18,12:00,50,0,0,6
Dresden,2020-03-19,12:00,60,0,0,8
Dresden,2020-03-20,12:00,97,0,0,9
Dresden,2020-03-21,12:00,115,0,0,10
Dresden,2020-03-22,12:00,139,0,0,11
Dresden,2020-03-23,12:00,154,1,0,14
Dresden,2020-03-24,12:00,167,2,0,17
Dresden,2020-03-25,12:00,216,2,0,17
Dresden,2020-03-26,12:00,239,2,0,17
Dresden,2020-03-27,12:00,287,2,0,23
Dresden,2020-03-28,12:00,314,2,0,25
Dresden,2020-03-29,12:00,320,2,0,28
Dresden,2020-03-30,12:00,339,3,0,37
Dresden,2020-03-31,12:00,358,3,0,43
@psteinb I added it as a GH Action, to be seen here… https://github.com/vv01f/covid19-curve-your-city/runs/548655100?check_suite_focus=true The file content is shown in the log.
Super cool, I'd love to merge this into this repo and pull the data every afternoon. I suggest storing the csv inside
/repo-root/data/de_dresden_www.csv
I am unfamiliar with GH Actions. Could you send a PR and I'll merge it?
That makes two of us. I will try to find out how this works…
Better idea: commit from within the CI. PR sent.
The data came in perfectly today. Thanks for this wonderful PR. Now I don't have to deal with this. Much appreciated. ;)
Closed by #19
Today no data was available between noon and 20 past. I expect that to change in 5 h. Right now I am logging every 5 min to determine when they publish. Then an additional schedule for Saturday (and maybe Sunday) can be added.
It's totally fine as it is. I am super thankful for this wonderful contribution.
If I saw it correctly, the plot on dresden.de pulls the data from a server to visualize it.
Using the Firefox dev tools, I can see the request and the params in its query string.
The response payload is the JSON shown above.
As we can see, the city of Dresden has two more categories, "Sterbefall" and "Genesungsfall".
At best, I need a script that automatically pulls this data and converts it to csv for Dresden!
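For the conversion itself, a more compact alternative to the field-by-field `cut` parsing used in the script above would be to let `jq` build the CSV rows directly. A sketch, assuming `jq` is available and using the field names from the payload above (`// 0` maps the `null` entries to 0):

```shell
# Sample payload (one shortened feature); real input would be the full query result.
payload='{"features":[{"attributes":{"Datum":"08.03.20","Fallzahl":2,"Sterbefall":null,"Genesungsfall":null}}]}'
echo "$payload" | jq -r '.features[].attributes
  | [.Datum, .Fallzahl, (.Sterbefall // 0), (.Genesungsfall // 0)]
  | @csv'
# "08.03.20",2,0,0
```

Note that `@csv` quotes string fields, so the date column would need unquoting (or reformatting inside `jq`) to match the repo's existing csv exactly.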