cmu-delphi / delphi-epidata

An open API for epidemiological data.
https://cmu-delphi.github.io/delphi-epidata/
MIT License
101 stars 63 forks

Add NSSP data as an endpoint #558

Open nickreich opened 3 years ago

nickreich commented 3 years ago

https://covid.cdc.gov/covid-data-tracker/#ed-visits

krivard commented 3 years ago

Nice! This particular link looks like it would be compatible with the covidcast schema, but NSSP provides way, way more than that through ESSENCE. Are you just looking for the COVID-related ED visits, or something more detailed that would require its own endpoint?

nickreich commented 3 years ago

I'm not clear on the "endpoint" terminology, but yes, it would be nice to have multiple different signals (COVID-related ED visits, flu-related ED visits, etc...). This has come up recently as a possible data source to use for future FluSight challenges, in conversations with CDC, UMass, and CMU Delphi folks.

krivard commented 3 years ago

Ah sorry, by "endpoint" I mean the following: Each endpoint of the API returns data in a different format. For example, the covidcast endpoint is largely designed for sample-based estimates, and includes only the value, stdev, and sample size in each row returned; by contrast, the covid_hosp endpoint returns over 60 different fields in each row, most of them raw counts; the fluview endpoint returns 5 different fields plus an age-stratified count.

Making a new endpoint means we can support whatever output format we want, but you have to use the delphi_epidata client to access the data, which is rather bare-bones at the moment. Adding it into the covidcast endpoint restricts the output format, but you can use the covidcast client libraries to access the data, which do things like automatically format the results as a data frame, plot choropleth maps, and compute correlation analyses.
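
To make the trade-off concrete, here is a rough sketch (not the actual Delphi schema or client code) of what "restricting the output format" could look like: a wide row with several ED-visit columns gets split into narrow, one-value-per-row records. The field names `signal`, `geo_value`, `time_value`, `value`, `stderr`, and `sample_size` are assumptions modeled on the covidcast row described above; the NSSP column names and the numbers are made-up placeholders, and `wide_to_narrow` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class NarrowRow:
    """One value per (signal, geo, time), in the spirit of a covidcast row.

    Field names are assumptions for illustration, not a confirmed schema.
    """
    signal: str
    geo_value: str
    time_value: int                      # date encoded as YYYYMMDD
    value: float
    stderr: Optional[float] = None       # unknown for raw percentages
    sample_size: Optional[float] = None  # unknown for raw percentages


def wide_to_narrow(wide_row: Dict, signal_columns: List[str]) -> List[NarrowRow]:
    """Split one wide row (many columns) into one narrow row per signal."""
    return [
        NarrowRow(
            signal=column,
            geo_value=wide_row["geo_value"],
            time_value=wide_row["time_value"],
            value=float(wide_row[column]),
        )
        for column in signal_columns
        if wide_row.get(column) is not None  # skip signals missing from this row
    ]


# Hypothetical wide row, as a dedicated endpoint might return it.
# Column names and numbers are placeholders, not real NSSP data.
example_wide_row = {
    "geo_value": "pa",
    "time_value": 20220101,
    "pct_ed_visits_covid": 3.2,
    "pct_ed_visits_influenza": 1.1,
}

narrow_rows = wide_to_narrow(
    example_wide_row,
    ["pct_ed_visits_covid", "pct_ed_visits_influenza"],
)
for row in narrow_rows:
    print(row)
```

Each ED-visit series becomes its own signal in the narrow form, which is why a handful of series fits the covidcast route comfortably while dozens of fields per row would argue for a dedicated endpoint.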

If you need a small number of time series (like <5?) or you don't know exactly which ones you want yet, then adding them to covidcast is the most expedient option.

So far I have:

Anything else?

nickreich commented 3 years ago

Mentioning here that I've had an offline discussion with @ryantibs about this. I'm honestly not sure exactly how many time series might be useful from the larger NSSP/ESSENCE systems. This may also tie into plans for using these data sources in future flu modeling efforts.

brookslogan commented 3 years ago

TL;DR: I can't find any regularly updated NSSP ILI or influenza time series data. That would be an issue for potential usage for flu forecasting.

There are multiple ways that NSSP data extracts are currently published; here are three:

The state- and age-group-level "diagnosed COVID-19" series is the only one I have seen with ongoing updates. If this is to be used for ILI situational awareness or forecasting, there will need to be regularly published/shared ILI time series as well.

nickreich commented 5 months ago

I want to strongly bump this up as a request for hoovering these data into the EpiData API. It came up again today in a conversation with CDC as these data may serve as future modeling/forecasting targets. I see two related datasets:

melange396 commented 2 months ago

This may be achieved, at least partially, by https://github.com/cmu-delphi/covidcast-indicators/pull/1952