Open kermidt opened 4 years ago
Hi,
The cumulative count of confirmed cases for 4 May is off by ~400 when compared against https://github.com/CSSEGISandData/COVID-19 or https://docs.google.com/spreadsheets/d/1ierEhD6gcq51HAm433knjnVwey4ZE5DCnu1bW7PRG3E/htmlview?usp=sharing
From https://raw.githubusercontent.com/dtandev/coronavirus/master/data/CoronavirusPL%20-%20Timeseries.csv I get:
05/01  13196.0
05/02  13473.0
05/03  13572.0
05/04  13585.0
From https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv I get:
5/1/20  13105
5/2/20  13375
5/3/20  13693
5/4/20  14006
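The CSSE file is a wide CSV with one column per date, so the Poland figures have to be read out of a single row. A minimal sketch of that step, assuming the layout of four metadata columns (`Province/State`, `Country/Region`, `Lat`, `Long`) followed by date columns; the helper name `confirmed_for_country` and the inline sample are illustrative and not part of either repository:

```python
import csv
import io

META_COLUMNS = {"Province/State", "Country/Region", "Lat", "Long"}


def confirmed_for_country(csv_text, country="Poland"):
    """Return {date_label: cumulative_confirmed} for one country.

    Assumes the CSSE wide layout: metadata columns first, then one
    column per date, labelled like 5/4/20.
    """
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Country/Region"] != country:
            continue
        # Some countries are split across several province rows, so sum them.
        for label, value in row.items():
            if label in META_COLUMNS or not value:
                continue
            totals[label] = totals.get(label, 0) + int(float(value))
    return totals


# Tiny stand-in for the real file: the two counts are the ones quoted
# in this issue, the coordinates are placeholders.
sample = (
    "Province/State,Country/Region,Lat,Long,5/3/20,5/4/20\n"
    ",Poland,0.0,0.0,13693,14006\n"
)
print(confirmed_for_country(sample))
```

With the real file, the text downloaded from the CSSE URL (for example `requests.get(url).text`) could be passed in place of `sample`.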
script:
import pandas as pd
import io
import requests
import numpy as np

url='https://raw.githubusercontent.com/dtandev/coronavirus/master/data/CoronavirusPL%20-%20Timeseries.csv'

def plStrToDate(x):
    day, month, year = map(int, x.split("-"))
    return '%02d/%02d' % (month, day)

s = requests.get(url).content
d = pd.read_csv(io.StringIO(s.decode('utf-8')))
d = d.drop('Age', axis=1)
d['Timestamp'] = d['Timestamp'].map(plStrToDate)
d['Confirmed'] = d['Infection/Death/Recovery'] == 'I'
d['Recovered'] = d['Infection/Death/Recovery'] == 'R'
d['Deaths'] = d['Infection/Death/Recovery'] == 'D'
d = d.groupby(['Timestamp']).sum()
d = d.cumsum()
d
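To put numbers on the gap, the two series quoted above can be compared day by day. This is only a sketch over the four figures in this issue; the dictionary names and the `compare` helper are illustrative, and the CSSE date labels are rewritten to MM/DD so the keys line up:

```python
# Cumulative confirmed counts for Poland as quoted in this issue.
dtandev = {"05/01": 13196, "05/02": 13473, "05/03": 13572, "05/04": 13585}
csse = {"05/01": 13105, "05/02": 13375, "05/03": 13693, "05/04": 14006}


def compare(a, b):
    """Return (date, a - b, daily increase in a, daily increase in b) rows."""
    rows = []
    prev_a = prev_b = None
    for day in sorted(a.keys() & b.keys()):
        # The first day has no previous value, so its increases are None.
        inc_a = None if prev_a is None else a[day] - prev_a
        inc_b = None if prev_b is None else b[day] - prev_b
        rows.append((day, a[day] - b[day], inc_a, inc_b))
        prev_a, prev_b = a[day], b[day]
    return rows


for row in compare(dtandev, csse):
    print(row)
# ('05/01', 91, None, None)
# ('05/02', 98, 277, 270)
# ('05/03', -121, 99, 318)
# ('05/04', -421, 13, 313)
```

On these figures the dtandev series is slightly ahead on 05/01 and 05/02 and falls behind from 05/03, with its daily increase dropping to 99 and then 13 while CSSE shows 318 and 313. That would be consistent with entries for the last two days being missing, though four data points are not enough to say so with confidence.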