Open CSSEGISandData opened 4 years ago
Will recovered cases still be reported on the daily CSV files? Will they reflect the daily recovered or aggregated?
@DataChant Recovered cases will no longer be reported in the daily reports or the time series tables.
Update: we have added a recovered time series table for most countries. Thanks!
Woah, major news. Let’s do this. Bummed about no recovered but seems to be difficult to collect. County level data is going to be massive. Thank you
Thanks for your work. I'd like to know why you won't report or provide recovered cases.
No reliable data source reporting recovered cases for many countries, such as the US.
Can you please provide us a date/time for that cutover? Can we place these new files into a different folder and leave the old files in place? This way current dashboards that we may have running won't be full of errors when the cutover happens?
Thank you, Ryan
Thanks so much! I'm making a Power BI Report now, so it's good to know about these upcoming changes!
Thanks @CSSEGISandData With respect to your second bullet point, will Province/State remain for countries (excluding US) where you can source the data?
Changes look good - thanks for all the hard work - this is a very important data set! Bevan
How do you count active cases without having recovered available?
You don’t, just confirmed, deaths, and testing.
I'm just grouping the difference together into a group called "Active or Recovered". Like @paolinic03 said, it's the best we can do for the moment.
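A minimal sketch of that grouping, assuming each record carries cumulative confirmed and deaths counts. The function name and the sample numbers are made up for illustration:

```python
def active_or_recovered(confirmed: int, deaths: int) -> int:
    """Confirmed cases that are not recorded deaths: still active or already recovered."""
    if deaths > confirmed:
        raise ValueError("deaths cannot exceed confirmed cases")
    return confirmed - deaths


# Made-up numbers, only to show the arithmetic.
print(active_or_recovered(confirmed=1000, deaths=25))  # 975
```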
THANK YOU!!! :)
Will there be a release for those mentioned tables today? I don't see US tables yet.
Thank you for this. This really is an amazing resource, and I'm excited for these changes. I recommend pinning this issue so that folks don't miss it. https://help.github.com/en/github/managing-your-work-on-github/pinning-an-issue-to-your-repository
US states data is being removed. Will the us states confirmed cases and deaths be obtained by aggregating over the counties? If so, are you going to provide the state as a separate column or as part of the county name? Thanks
You can parse state from the county name with some quick data transformation...
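One way that transformation might look, assuming county entries are written as `County, ST` (for example `King County, WA`). That naming pattern and the function name are assumptions for illustration, not a description of the actual files:

```python
from typing import Optional, Tuple


def split_county_state(name: str) -> Tuple[str, Optional[str]]:
    """Split 'County, ST' into (county, state); state is None when there is no comma."""
    county, sep, state = name.rpartition(",")
    if not sep:
        # No comma found: treat the whole value as the place name.
        return name.strip(), None
    return county.strip(), state.strip()


print(split_county_state("King County, WA"))  # ('King County', 'WA')
print(split_county_state("Washington"))       # ('Washington', None)
```

This kind of string splitting is brittle if the naming changes, which is why a dedicated state column or code is preferable when available.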
In the course of the hackathon "https://wirvsvirushackathon.org/" we are implementing a RESTful web service that scrapes data from various places. We have developed a landing page to make it easier for interested people to implement a scraper and/or use our API.
Maybe you can help to make the API stable and reusable! So we have to fix breaking changes only once ;-)
check it out: https://corona-api-landingpage.netlify.com/
Thanks for this. This API's output is 2 days behind. I tried "https://corona.ndo.dev/api/daily" and it showed results only up to March 20th. It is good as a backup.
Thanks for giving us a heads up so we may prepare for the changes. And thank you for all the work you're doing!
Can you please provide us a date/time for that cutover? Can we place these new files into a different folder and leave the old files in place? This way current dashboards that we may have running won't be full of errors when the cutover happens?
This ☝️. Please @CSSEGISandData, help us minimize the breaking changes.
The ISO code will be added in the global time series tables.
I think this is also to be added yet so the format of the global time series will change yet again.
Trying to answer my own question, this ticket mentions FIPS code will be added. Those could be county codes or state codes, I hope both, but in either case should support aggregation without unsavory brittle regex tricks.
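If the codes turn out to be county-level FIPS codes, state totals can be derived without parsing names, because the first two digits of a five-digit county FIPS code are the state code. A small sketch under that assumption; the input shape (pairs of FIPS code and count) and the counts are made up:

```python
from collections import defaultdict


def aggregate_by_state_fips(county_rows):
    """Sum county counts per state, keyed by the two-digit state FIPS prefix."""
    totals = defaultdict(int)
    for fips, count in county_rows:
        # Zero-pad so codes read as integers (e.g. 6037) still yield '06'.
        state = str(fips).zfill(5)[:2]
        totals[state] += count
    return dict(totals)


# 53033 and 53061 are counties in Washington (53); 06037 is in California (06).
rows = [("53033", 10), ("53061", 5), (6037, 7)]
print(aggregate_by_state_fips(rows))  # {'53': 15, '06': 7}
```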
Are you planning to include Canadian provincial data in one of the data sets?
Thanks
@rtroha it's already in there, no? Working on email reports right now, using worldwide data, but also Canadian data by provinces (I'm in Canada).
@jipiboily It's there now, but when the format changes they said they're getting rid of state data, so if that's true I'm not sure how they're going to handle Canada (I'm in the US but we have reporting needs for Canada as well).
@rtroha
Changes to the current time series include the removal of the US state and county-level entries, which will be replaced with a new single country level entry for the US.
I'm assuming getting rid of only US state data. Canada provincial data is still there in the 'global' file as of now at least.
@CSSEGISandData Thanks for continuing to make better and being transparent. I look forward to having the counts at the county level in the US.
The Living Atlas US Cases feature layer is listed as deprecated, but I believe it is still being updated and now has county level data.
Instead of changing only the time series, you broke the consistency of the data format in the daily reports. #1326
@CSSEGISandData please KEEP and UPDATE a recovered cases file. As you've consolidated all states into one "US" row, your argument no longer applies. Unbreak your repository.
@CSSEGISandData keen to understand where the change process is at?
The changes made here do not seem to match what was planned and, as others have mentioned, it is now less clear how to build a clean dataset for fine-grained time series data, which is what we all want.
If we want a fine grained data set do we have to mash the daily with time series - replacing the country+county/state, etc in the time series each day?
It does not seem to make sense the way this repository is progressing.
Could you please explain to us how to compile this data based on the current changes and those planned for the future?
Thanks
It does NOT make any sense to me why you removed US states but still keep state-level entries for other countries (e.g., provinces in China, Canada, etc.) in "time_series_covid19_confirmed_global.csv".
@yystat Separate US time-series with county level details is coming...
Not sure why it's not there yet before deprecation of old one starts. Ideally there should be some overlap.
I'm not sure why they want to keep US data in a separate file. Previously I only needed to download one file, and if I wanted to focus on the US, I only selected region==US. Now I have to deal with 2 separate files.
What time are you planning on uploading the US time series files such as time_series_covid19_confirmed_US.csv?
This is a deal breaker for me. I can't believe they did this in the middle of the pandemic. I built something in a code sprint in my spare time that worked great and now it is broken and I do not have time to fix this. Please put back the US states!
And they put this in issues, not on the front page of the repo, or the readme pages for the different csv folders. I mean...my site just got up and running on Saturday and now it's basically useless.
No reliable data source reporting recovered cases for many countries, such as the US.
Just because US doesn't report recovery, does that mean the rest of the world must follow?
I really would hope the team would reconsider this, as I'm sure a lot of other folks around the world would appreciate this as well. There's no reliable recovery data for some countries (namely US), but most countries provide this valuable information. Are you also going to remove this from the interactive map too?
They said "many countries". As big as it is, the US is just one.
We all know that data out of the US is scarcely reliable due to their lack of a centralised system of reporting, so why deprive us of more reliable data from the rest of the world that has coordinated health care reporting? If this data set is now just going to be tailored to the needs of the US administration, then it is not reliable for the rest of the world.
So I see the deprecation notice in the readme and asking everyone to use the new "global" data but where is this:
The ISO code will be added in the global time series tables.
When that's added later, things will break again...
For US data, will longitude/latitude coordinates still be provided, or do I have to find a way to map FIPS to longitude/latitude?
Please keep state level data for the US. Provinces for other countries are still reported, not sure why US wouldn't be, especially considering JHU is in the US 😆
The number of recovered cases is VERY important data: with exponential growth like this, the measure to look at is the growth RELATIVE to the number of active cases. Without that knowledge, the data is, I would say, useless.
It's not acceptable to publish some of the data in different formats in the same commit. For example, cse_covid_19_daily_reports/03-23-2020.csv is a different format from cse_covid_19_daily_reports/03-22-2020.csv, which is a different format from cse_covid_19_daily_reports/02-01-2020.csv.
I can understand the schema changing over time (e.g. recovered counts). If that happens, then:
- Ensure all the data in the previous commit is using the previous schema.
- Ensure all the data in the current commit is using the new schema.
That way, any code that reads the data can be consistent for each version of your repo, instead of being different for various date ranges. (And if you don't expect code to read your data, then why are you publishing it?)
Even better, publish all the data in all versions of the schema in every commit. That way the code that reads the data can remain consistent across multiple versions of your repo.
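Until the files are consistent, a reader can absorb header differences with an alias map that translates each schema version to one set of canonical names. The sketch below is only an illustration: the header variants in `ALIASES` and the sample rows are assumptions, not a statement of what the daily reports actually contain.

```python
import csv
import io

# Hypothetical alias map: every header variant maps to one canonical name.
ALIASES = {
    "Province/State": "province_state",
    "Province_State": "province_state",
    "Country/Region": "country_region",
    "Country_Region": "country_region",
    "Confirmed": "confirmed",
    "Deaths": "deaths",
}


def read_report(text):
    """Parse CSV text and return rows keyed by canonical column names."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        # Keep only known columns; unknown headers are dropped.
        rows.append({ALIASES[k]: v for k, v in raw.items() if k in ALIASES})
    return rows


# Two made-up files with the same data under different headers.
old = "Province/State,Country/Region,Confirmed,Deaths\nHubei,China,100,2\n"
new = "Province_State,Country_Region,Confirmed,Deaths\nHubei,China,100,2\n"
assert read_report(old) == read_report(new)
```

The downstream code then depends only on the canonical names, so a header rename requires one new entry in the map rather than a change per date range.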
I can't believe that the data is provided by Johns Hopkins :D
Why do you push the global data already if the US data is not published? Thousands depend on this data set and you just change it the way you like from day to day without any consistency. Please be more thoughtful. This is amazing data but it's worthless if people need hours every week to rework their scripts.
Additionally, in 2 weeks this data will be completely useless without data for recovered cases. It's already useless for China: in Hubei, China, for example, there are about 60,000 recovered cases.
The number of recovered cases is VERY important data: with exponential growth like this, the measure to look at is the growth RELATIVE to the number of active cases. Without that knowledge, the data is, I would say, useless.
I agree with you. Without the recovered cases it's not possible to estimate the curve of the actual infected and this is a great lack for the analysis. Not possible to make predictions.....
I can understand the frustration, as another person who also keeps having to tweak code to adapt to these changes, but I'm shocked by all the complaining. The fact that we even have access to this data that they are putting days and nights of effort into gathering is an absolute gift. Just being provided access to their hard work is wonderful. Let's practice a little gratitude during these times. 🙏
No reliable data source reporting recovered cases for many countries, such as the US.
Wouldn't it be possible to provide recoveries for the countries with reliable sources and keep the rest NA ?
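On the consumer side, that suggestion could be handled by treating a missing recovered count as unknown rather than zero. A minimal sketch, assuming missing values arrive as `None`; the function name and sample numbers are illustrative:

```python
from typing import Optional


def active_cases(confirmed: int, deaths: int, recovered: Optional[int]) -> Optional[int]:
    """Return confirmed - deaths - recovered, or None when recovered is not reported."""
    if recovered is None:
        # Unknown recoveries: do not pretend the active count is known.
        return None
    return confirmed - deaths - recovered


print(active_cases(1000, 25, 400))   # 575
print(active_cases(1000, 25, None))  # None
```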
I've removed recovered cases from our feeds as per these changes, but I'm curious as to why recovered cases are still showing on the ArcGIS dashboard? Many thanks for all your hard work.
We will update the time series tables in the following days, aiming to provide a cleaner and more organized dataset consistent with our new/current naming convention. We will also be reporting a new variable (i.e, testing), as well as data at the county level for the US. All files will continue to be updated daily around 11:59PM UTC.
The following specific changes will be made:
- Three new time series tables will be added for the US. The first two will be the confirmed cases and deaths, reported at the county level. The third, number of tests conducted, will be reported at the state level. These new tables will be named time_series_covid19_confirmed_US.csv, time_series_covid19_deaths_US.csv, and time_series_covid19_testing_US.csv, respectively.
- Changes to the current time series include the removal of the US state and county-level entries, which will be replaced with a new single country level entry for the US. The tables will be renamed time_series_covid19_confirmed_global.csv, time_series_covid19_deaths_global.csv, and time_series_covid19_testing_global.csv, respectively.
- The ISO code will be added in the global time series tables.
- The FIPS code will be added in the new US time series tables.
- We will no longer provide recovered cases.
- The current set of time series files will be moved to our archive folder, and the new files will be added to the current folder.
Thanks!
Update: time_series_covid19_recovered_global.csv is added.