CSSEGISandData / COVID-19

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://systems.jhu.edu/research/public-health/ncov/
29.14k stars 18.44k forks

Where can I get longitudinal data of the COVID-19? #1655

Open skanskan opened 4 years ago

skanskan commented 4 years ago

Hello.

Where can I download a full dataset of SARS-CoV-2 cases containing time information for each individual (date of infection, date of death, age, gender, weight, censoring information, date recovered…)?

It’s important to get this information in order to do survival analyses and other complex studies.

I’m only able to find it aggregated, but I need details about each case.

cipriancraciun commented 4 years ago

(Not affiliated with the project.)

I don't think this kind of dataset actually exists, or can exist, for logistical reasons.

For example, in Romania, for the first 200 cases or so, a group of volunteers kept a per-individual record like the one you are asking for. (It was built from news reports and exposed via an API.)

However, this became impossible to maintain, and they have now removed even that API.

(Why I say it's practically impossible to create for logistical reasons: at the moment there seem to be at most a handful of aggregated datasets in the whole world, like JHU's, and none of them is actually "accurate" enough...)

skanskan commented 4 years ago

Even small subsets would be good

cipriancraciun commented 4 years ago

@skanskan, on #1250 @harish0412 provided a link to a dataset for India containing data on individual cases (around 800 or so individuals):

harish0412 commented 4 years ago

Yes, those are missing. If we use Tableau, I think we can get the latitude and longitude data, but I'm not sure.

On Sat, 28 Mar, 2020, 12:28 am Ciprian Dorin Craciun, < notifications@github.com> wrote:

@skanskan, on #1250 @harish0412 provided a link to a dataset for India containing individual users data (around 800 or so individuals):

- https://docs.google.com/spreadsheets/d/e/2PACX-1vSc_2y5N0I67wDU38DjDh35IZSIS30rQf7_NYZhtYYGU1jJYT6_kDx4YpF-qw0LSlGsBYP8pqM_a1Pd/pubhtml


cipriancraciun commented 4 years ago

@harish0412, by "longitudinal" the original poster wasn't referring to geographical coordinates, but to a dataset that, instead of summarizing cases, presents them one by one, so one can build statistical models to identify correlations, etc.
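
To make the distinction concrete, here is a minimal Python sketch of what is lost when per-case records are collapsed into daily counts like those in aggregated datasets. The records, field names (`case_id`, `diagnosed`, `age`) and the `aggregate_daily` helper are all hypothetical and invented for illustration; they do not come from any real dataset.

```python
from collections import Counter
from datetime import date

# Hypothetical per-case records (a "line list"); all values are made up.
line_list = [
    {"case_id": 1, "diagnosed": date(2020, 3, 1), "age": 34},
    {"case_id": 2, "diagnosed": date(2020, 3, 1), "age": 71},
    {"case_id": 3, "diagnosed": date(2020, 3, 2), "age": 58},
]


def aggregate_daily(records):
    """Collapse per-case records into counts per diagnosis date.

    Individual attributes such as age are discarded in the process.
    """
    return dict(Counter(r["diagnosed"] for r in records))


daily_counts = aggregate_daily(line_list)
print(daily_counts)
```

Going from the line list to the counts is easy; going back is not possible, which is why per-case data has to be collected as such.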

skanskan commented 4 years ago

Yes, I'm sorry for the confusion; some people call it longitudinal data. I mean data with detailed time information on an individual basis (date of diagnosis, date of discharge, date of death, date of other changes, treatment...). This information is needed to perform survival analysis and other complex studies.
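
As a rough illustration of why the individual dates matter, the sketch below turns per-case records into (duration, event) pairs and computes a Kaplan-Meier survival estimate by hand. Everything in it is an assumption made for the example: the records are invented, the field names (`diagnosed`, `died`, `last_seen`) are hypothetical, and a case with no death date is treated as right-censored at its last observation.

```python
from datetime import date

# Hypothetical records, invented for illustration only.
# "died" is None when the case was still alive at "last_seen" (right-censored).
cases = [
    {"diagnosed": date(2020, 3, 1), "died": date(2020, 3, 6), "last_seen": date(2020, 3, 6)},
    {"diagnosed": date(2020, 3, 2), "died": None, "last_seen": date(2020, 3, 12)},
    {"diagnosed": date(2020, 3, 3), "died": date(2020, 3, 11), "last_seen": date(2020, 3, 11)},
    {"diagnosed": date(2020, 3, 5), "died": None, "last_seen": date(2020, 3, 17)},
]


def to_duration_event(record):
    """Return (days from diagnosis to death or censoring, True if death was observed)."""
    died = record["died"]
    end = died if died is not None else record["last_seen"]
    return (end - record["diagnosed"]).days, died is not None


def kaplan_meier(observations):
    """Return [(t, S(t))] evaluated at each distinct observed event time t."""
    curve = []
    survival = 1.0
    event_times = sorted({t for t, event in observations if event})
    for t in event_times:
        # Cases still under observation just before time t.
        at_risk = sum(1 for d, _ in observations if d >= t)
        deaths = sum(1 for d, e in observations if d == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


observations = [to_duration_event(c) for c in cases]
curve = kaplan_meier(observations)
for t, s in curve:
    print(f"day {t}: estimated survival {s:.2f}")
```

None of this can be computed from daily totals alone, because the estimate needs to know, for each case, how long it was followed and whether the follow-up ended in an event or in censoring.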