datasets / covid-19

Novel Coronavirus 2019 time series data on cases
https://datahub.io/core/covid-19

Canada Recovery Data #47

Closed RKattoula closed 4 years ago

RKattoula commented 4 years ago

I'm not seeing recovery data for Canada, although it is being updated in the Johns Hopkins data.

Those are the only NAs I'm seeing. Great work on this, thanks a ton.

anuveyatsu commented 4 years ago

@RKattoula thanks for reporting this. It's caused by an inconsistency in the upstream data files.

Any suggestion would be useful 😄

sympatheia-one commented 4 years ago

Suggestions: there seem to be at least 3 options: (1) leave the values out to reflect the source and see whether it is eventually updated; (2) "allocate" the total number of Recovered to the various provinces (for example, in proportion to their confirmed cases); or (3) just push the Recovered cases for Canada into a new set of rows where the Province/Region is labelled as something non-existent, like RECOVERED AGGREGATE, and which then hold only the Recovered numbers.

(1) is obviously the easiest. (2) invents numbers that don't really exist and may not be useful for some. With (3) [my preference], we are at least back to the full picture on a sum-per-country level.
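To make the difference between options (2) and (3) concrete, here is a minimal sketch in plain Python. The row layout (`country`, `province`, `confirmed`, `recovered`), the function names, and the sample numbers are all assumptions made for illustration; this is not the repository's actual processing script.

```python
# Hypothetical row layout: the real dataset's columns may be named differently.

def allocate_recovered(rows, country, total_recovered):
    """Option 2: split a country total across provinces by share of confirmed cases."""
    confirmed_total = sum(r["confirmed"] for r in rows if r["country"] == country)
    out = []
    for r in rows:
        r = dict(r)  # copy so the input rows are left untouched
        if r["country"] == country and confirmed_total > 0:
            # These per-province values are estimates, not reported figures.
            r["recovered"] = round(total_recovered * r["confirmed"] / confirmed_total)
        out.append(r)
    return out


def append_aggregate_row(rows, country, total_recovered, label="Recovery Aggregated"):
    """Option 3: leave province rows as they are and add one placeholder row for the total."""
    aggregate = {
        "country": country,
        "province": label,  # placeholder label, not a real province
        "confirmed": None,
        "recovered": total_recovered,
    }
    return rows + [aggregate]


def country_recovered(rows, country):
    """Sum recovered per country, skipping rows where the value is missing."""
    return sum(
        r["recovered"]
        for r in rows
        if r["country"] == country and r["recovered"] is not None
    )


# Made-up sample numbers, for illustration only.
sample = [
    {"country": "Canada", "province": "Ontario", "confirmed": 300, "recovered": None},
    {"country": "Canada", "province": "Quebec", "confirmed": 100, "recovered": None},
]
allocated = allocate_recovered(sample, "Canada", 40)
aggregated = append_aggregate_row(sample, "Canada", 40)
```

With option (3) the per-country sum matches the source total exactly and no province gets a made-up value. With option (2), rounding can also make the allocated values drift from the reported total.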

anuveyatsu commented 4 years ago

@sgg70 option 3 sounds good to me

anuveyatsu commented 4 years ago

Fixed. Please check and close this issue.

@RKattoula @sgg70

sympatheia-one commented 4 years ago

Hmm... just one question: looking at the raw file, there is now a "Recovery" block with no real values, followed by the "Recovery Aggregated" block, as suggested. I guess that first block can be omitted?

anuveyatsu commented 4 years ago

@sgg70 yes, the first one is buggy upstream data that I missed. It should be fixed now.
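For anyone who wants to check for leftover blocks like this in their own copy, one possible approach is to drop rows that carry no value in any metric. The sketch below assumes hypothetical field names (`confirmed`, `recovered`, `deaths`) and made-up sample rows; it is not the fix that was applied in the repository.

```python
# Hypothetical metric names: adjust to the columns in the file you are checking.
METRICS = ("confirmed", "recovered", "deaths")


def has_any_value(row):
    """True if at least one metric in the row is present."""
    return any(row.get(metric) is not None for metric in METRICS)


def drop_empty_rows(rows):
    """Keep only rows that carry at least one real value."""
    return [row for row in rows if has_any_value(row)]


# Made-up sample rows, for illustration only.
raw = [
    {"province": "Ontario", "confirmed": 300, "recovered": None, "deaths": 5},
    {"province": "Recovery", "confirmed": None, "recovered": None, "deaths": None},
    {"province": "Recovery Aggregated", "confirmed": None, "recovered": 40, "deaths": None},
]
cleaned = drop_empty_rows(raw)
```

The aggregate row survives because it has a recovered value, while the all-empty block is removed.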