CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/

Python Code to Reformat New Format to Older Format #1458

Open hippodribble opened 4 years ago

hippodribble commented 4 years ago

I see they changed the format. The following code puts a "new" file into the "old" format.

Process

  1. Read the file
  2. Swap and rename columns
  3. Write the new file

The function writes the result to the same folder as the input, with a "b" added to the filename just before ".csv" (so data.csv becomes datab.csv).

Call it with the full path of a file in the "new" format.

Cheers

Glenn

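import os.path

import pandas as pd


def reformat(infile):
    # Build the output path: same folder, with "b" inserted before ".csv".
    (folder, name) = os.path.split(infile)
    outfile = os.path.join(folder, name.replace(".csv", "b.csv"))
    print(infile)
    print(outfile)
    # Parse the fifth column (index 4) as dates; in the new format this
    # should be Last_Update.
    df1 = pd.read_csv(infile, parse_dates=[4], skip_blank_lines=False)
    # Old-format column name -> new-format column name.
    dic = {
        "Province/State": "Province_State",
        "Country/Region": "Country_Region",
        "Last Update": "Last_Update",
        "Confirmed": "Confirmed",
        "Deaths": "Deaths",
        "Recovered": "Recovered",
        "Latitude": "Lat",
        "Longitude": "Long_",
    }
    # Pick the new-format columns in the old order, then give them the old names.
    # The copy avoids modifying a view of df1.
    temp = df1[list(dic.values())].copy()
    temp.columns = list(dic.keys())
    temp.reset_index(drop=True, inplace=True)
    temp.to_csv(outfile, index=False)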
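
To check the column mapping without downloading a report, here is a minimal sketch that is not part of the original post. It applies the same rename to a one-row, in-memory CSV. The header line is an assumption about the "new" layout, the row values are made up, and OLD_TO_NEW and to_old_format are hypothetical names used only for this illustration.

```python
import io

import pandas as pd

# Old-format name -> new-format name, same mapping as in the post above.
OLD_TO_NEW = {
    "Province/State": "Province_State",
    "Country/Region": "Country_Region",
    "Last Update": "Last_Update",
    "Confirmed": "Confirmed",
    "Deaths": "Deaths",
    "Recovered": "Recovered",
    "Latitude": "Lat",
    "Longitude": "Long_",
}


def to_old_format(df):
    """Return a new frame with old-format column names in the old order."""
    new_to_old = {new: old for old, new in OLD_TO_NEW.items()}
    # Selecting by the new names drops any extra columns and fixes the order.
    return df[list(new_to_old)].rename(columns=new_to_old).reset_index(drop=True)


# Assumed new-format header plus one made-up row (not real data).
sample = io.StringIO(
    "FIPS,Admin2,Province_State,Country_Region,Last_Update,Lat,Long_,"
    "Confirmed,Deaths,Recovered,Active,Combined_Key\n"
    "99999,Example County,Example State,Exampleland,2020-03-23 23:19:34,"
    "1.5,-2.5,10,1,2,7,Example Key\n"
)

new_df = pd.read_csv(sample, parse_dates=["Last_Update"])
old_df = to_old_format(new_df)
print(list(old_df.columns))
```

The printed list should contain only the eight old-style names, in the old order. Columns that exist only in the assumed new layout (FIPS, Admin2, Active, Combined_Key) are dropped.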

thomasXwang commented 4 years ago

Thank you, it's very useful!

hippodribble commented 4 years ago

You're welcome. I'm locked in for a few weeks, so there's not much else to do but read the numbers...