CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/
29.13k stars 18.43k forks source link

Power Query solution to get daily Active Cases #1548

Open kazinad opened 4 years ago

kazinad commented 4 years ago

This can be your starting point in Power BI to get daily Active Cases after the changes described in #1250. My solution is to process daily reports, not the time series files:

let
    Source = Web.Page(Web.Contents("https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_daily_reports")),
    Data0 = Source{0}[Data],
    #"Select Name Column" = Table.SelectColumns(Data0,{"Name"}),
    #"Filter to .csv File Names" = Table.SelectRows(#"Select Name Column", each Text.EndsWith([Name], ".csv")),
    #"Get Each File Contents" = Table.AddColumn(#"Filter to .csv File Names", "Contents", each Csv.Document(Web.Contents("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_daily_reports",  [RelativePath=[Name]]))),
    #"Promote Headers" = Table.AddColumn(#"Get Each File Contents", "Promote Headers", each Table.PromoteHeaders([Contents])),
    #"Expanded Promote Headers" = Table.ExpandTableColumn(#"Promote Headers", "Promote Headers", {"Province/State", "Country/Region", "Last Update", "Confirmed", "Deaths", "Recovered", "Latitude", "Longitude", "FIPS", "Admin2", "Province_State", "Country_Region", "Last_Update", "Lat", "Long_", "Active", "Combined_Key"}, {"Province/State", "Country/Region", "Last Update", "Confirmed", "Deaths", "Recovered", "Latitude", "Longitude", "FIPS", "Admin2", "Province_State", "Country_Region", "Last_Update", "Lat", "Long_", "Active", "Combined_Key"}),
    #"Removed Contents" = Table.RemoveColumns(#"Expanded Promote Headers",{"Contents"}),
    #"Changed Types1" = Table.TransformColumnTypes(#"Removed Contents",{{"Province/State", type text}, {"Country/Region", type text}, {"Last Update", type datetime}, {"Confirmed", Int64.Type}, {"Deaths", Int64.Type}, {"Recovered", Int64.Type}}),
    #"Replace Nulls in Confirmed" = Table.ReplaceValue(#"Changed Types1",null,0,Replacer.ReplaceValue,{"Confirmed"}),
    #"Replace Nulls in Deaths" = Table.ReplaceValue(#"Replace Nulls in Confirmed",null,0,Replacer.ReplaceValue,{"Deaths"}),
    #"Replaced Nulls in Recovered" = Table.ReplaceValue(#"Replace Nulls in Deaths",null,0,Replacer.ReplaceValue,{"Recovered"}),
    #"Filter Confirmed > 0" = Table.SelectRows(#"Replaced Nulls in Recovered", each [Confirmed] > 0),
    #"Add ActiveCalculated" = Table.AddColumn(#"Filter Confirmed > 0", "ActiveCalculated", each [Confirmed] - [Deaths] - [Recovered]),
    #"Changed Types2" = Table.TransformColumnTypes(#"Add ActiveCalculated",{{"ActiveCalculated", Int64.Type}, {"Last_Update", type datetime}}),
    #"Add CountryRegionCalculated" = Table.AddColumn(#"Changed Types2", "CountryRegionCalculated", each List.Last(List.Select({[#"Country/Region"], [Country_Region]}, each _ <> null))),
    #"Add Date" = Table.AddColumn(#"Add CountryRegionCalculated", "Date", each Date.FromText(Text.Middle([Name], 1, 10))),
    #"Changed Types3" = Table.TransformColumnTypes(#"Add Date",{{"Date", type date}}),
    #"Grouped Rows" = Table.Group(#"Changed Types3", {"Date", "CountryRegionCalculated"}, {{"Active", each List.Sum([ActiveCalculated]), type number}}),
    #"Renamed Columns" = Table.RenameColumns(#"Grouped Rows",{{"CountryRegionCalculated", "Country/Region"}}),
    #"Changed Types4" = Table.TransformColumnTypes(#"Renamed Columns",{{"Active", Int64.Type}, {"Country/Region", type text}}),
    #"Filter Active > 0" = Table.SelectRows(#"Changed Types4", each [Active] > 0)
in
    #"Filter Active > 0"

You can find the sample on the second page of my Power BI report: Covid-19 Power BI Report

desaian commented 4 years ago

Does anyone have python code to create a custom time series csv from the raw data? It's more difficult to manipulate the way it is set up now.

mannby commented 4 years ago

I use the following code to generate the time series data for https://www.c19vir.us

States.zip

desaian commented 4 years ago

Thanks!

Sent from Yahoo Mail on Android

On Wed, Mar 25, 2020 at 1:35 PM, Claes-Fredrik Mannbynotifications@github.com wrote:

I use the following code to generate the time series data for https://www.c19vir.us

States.zip

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.