vpontis closed this 4 years ago
I took a look at the JHU data and I think we can do the following:
The script below handles the name conversions by rewriting the Province/State and Country/Region columns row by row.
If you apply this before running your script, I think it should mostly do it.
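from csv import DictReader, DictWriter

PROVINCE = "Province/State"
REGION = "Country/Region"

# newline="" is what the csv module docs recommend for files handed to readers/writers.
with open("time_series_19-covid-Confirmed.csv", newline="") as f:
    src = DictReader(f)
    fieldnames = src.fieldnames
    dst = []
    for r in src:
        # Relabel the cruise ship row so it is not dropped by the filter below.
        if r[REGION] == "Cruise Ship":
            r[PROVINCE] = "Cruise Ship"
        # Drop rows whose province contains a comma or the word "Princess".
        if "," in r[PROVINCE] or "Princess" in r[PROVINCE]:
            continue
        # Use plain names as the province for these two regions.
        if r[REGION] == "Taiwan*":
            r[PROVINCE] = "Taiwan"
        if r[REGION] == "Korea, South":
            r[PROVINCE] = "South Korea"
        # Treat territories of these countries as regions in their own right.
        if r[REGION] in ["France", "Denmark", "Netherlands", "United Kingdom"] and r[PROVINCE] != r[REGION]:
            r[REGION] = r[PROVINCE]
        if r[PROVINCE] in ["Hong Kong", "Macau", "Puerto Rico", "Guam", "Virgin Islands"]:
            r[REGION] = r[PROVINCE]
        # Disambiguate the US state from the country of the same name.
        if r[PROVINCE] == "Georgia" and r[REGION] == "US":
            r[PROVINCE] = "Georgia (US)"
        # Country-level rows have no province; fall back to the region name.
        if r[PROVINCE] == "":
            r[PROVINCE] = r[REGION]
        if r[REGION] == "US":
            r[REGION] = "United States"
        dst.append(r)

with open("out.csv", "w", newline="") as f:
    w = DictWriter(f, fieldnames=fieldnames)
    w.writeheader()
    w.writerows(dst)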
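If it helps to check the rules without the CSV on disk, here is a rough sketch of the same renaming logic pulled out into a pure function. `normalize_row`, `SPLIT_COUNTRIES` and `STANDALONE` are names I made up for illustration, and the sample rows are invented, not copied from the JHU file.

```python
PROVINCE = "Province/State"
REGION = "Country/Region"

# Hypothetical names: countries whose territories become their own region,
# and provinces that are always treated as their own region.
SPLIT_COUNTRIES = {"France", "Denmark", "Netherlands", "United Kingdom"}
STANDALONE = {"Hong Kong", "Macau", "Puerto Rico", "Guam", "Virgin Islands"}


def normalize_row(row):
    """Return a renamed copy of row, or None if the row should be dropped."""
    r = dict(row)  # copy so the caller's dict is left untouched
    if r[REGION] == "Cruise Ship":
        r[PROVINCE] = "Cruise Ship"
    if "," in r[PROVINCE] or "Princess" in r[PROVINCE]:
        return None
    if r[REGION] == "Taiwan*":
        r[PROVINCE] = "Taiwan"
    if r[REGION] == "Korea, South":
        r[PROVINCE] = "South Korea"
    if r[REGION] in SPLIT_COUNTRIES and r[PROVINCE] != r[REGION]:
        r[REGION] = r[PROVINCE]
    if r[PROVINCE] in STANDALONE:
        r[REGION] = r[PROVINCE]
    if r[PROVINCE] == "Georgia" and r[REGION] == "US":
        r[PROVINCE] = "Georgia (US)"
    if r[PROVINCE] == "":
        r[PROVINCE] = r[REGION]
    if r[REGION] == "US":
        r[REGION] = "United States"
    return r


# Invented sample rows, only to exercise the rules.
samples = [
    {PROVINCE: "King County, WA", REGION: "US"},
    {PROVINCE: "Georgia", REGION: "US"},
    {PROVINCE: "", REGION: "Taiwan*"},
]
for s in samples:
    print(s, "->", normalize_row(s))
```

Note that the Taiwan and South Korea rules only change the province column; the region value is left as it came in, same as in the script above.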
Pull data from: https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series
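A possible way to pull one of those files from Python, as a sketch only. The raw-file base URL is my assumption, derived from the tree URL above using GitHub's usual raw.githubusercontent.com layout, and the files in that directory may have been renamed or moved since. `raw_url` and `download` are hypothetical helper names.

```python
from urllib.request import urlretrieve

# Assumption: raw-content counterpart of the tree URL above.
RAW_BASE = (
    "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
    "csse_covid_19_data/csse_covid_19_time_series/"
)


def raw_url(filename):
    """Build the raw download URL for a file in the time series directory."""
    return RAW_BASE + filename


def download(filename, dest=None):
    """Fetch one CSV and save it locally; returns the local path.

    Raises urllib.error.HTTPError if the file is no longer at that URL.
    """
    dest = dest or filename
    urlretrieve(raw_url(filename), dest)
    return dest
```

Calling `download("time_series_19-covid-Confirmed.csv")` would then save the file next to the script, provided it still exists under that name.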
I wrote a Python script to process the JHU data and convert it to JSON.
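That script isn't shown here, so the following is only a guess at what such a conversion could look like. It assumes the wide layout of the JHU time series files (Province/State, Country/Region, Lat, Long, then one column per date); the JSON shape, the `rows_to_json` name and the sample values are all made up for illustration.

```python
import csv
import io
import json

# Non-date columns in the (assumed) wide CSV layout.
META = ("Province/State", "Country/Region", "Lat", "Long")


def rows_to_json(csv_text):
    """Turn wide CSV text into a JSON list with one record per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        records.append({
            "province": row["Province/State"],
            "region": row["Country/Region"],
            # Every remaining column is treated as a date with an integer count.
            "confirmed": {
                col: int(val)
                for col, val in row.items()
                if col not in META and val != ""
            },
        })
    return json.dumps(records, indent=2)


# Illustrative sample only, not real data.
SAMPLE = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20
Hubei,China,30.97,112.27,444,444
,Italy,43.0,12.0,0,0
"""

print(rows_to_json(SAMPLE))
```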
TODO