cmu-delphi / covidcast-indicators

Back end for producing indicators and loading them into the COVIDcast API.
https://cmu-delphi.github.io/delphi-epidata/api/covidcast.html
MIT License

Consider redefining COVID-like illness and updating Qualtrics surveys to match #55

Open capnrefsmmat opened 4 years ago

capnrefsmmat commented 4 years ago

CDC guidance on common COVID symptoms has changed; @ryantibs says fever is no longer considered important, whereas fatigue and loss of smell/taste are.

We should

cc @RoniRos

RoniRos commented 4 years ago

Alas, there is no one universal definition. Here is a start (from https://www.cdc.gov/coronavirus/2019-ncov/symptoms-testing/symptoms.html):

"People with these symptoms may have COVID-19:

- Fever or chills
- Cough
- Shortness of breath or difficulty breathing
- Fatigue
- Muscle or body aches
- Headache
- New loss of taste or smell
- Sore throat
- Congestion or runny nose
- Nausea or vomiting
- Diarrhea

This list does not include all possible symptoms. CDC will continue to update this list as we learn more about COVID-19."

Sensitivity and specificity vary greatly among these symptoms. Fever used to be considered an "almost must" (89% in early estimates), but later was downgraded significantly. "New loss of taste or smell" is highly specific (probably the most specific against all respiratory pathogens -- none of the others is known to cause it!), but is not super sensitive (many covid+ people don't have it).

There is also great variability by age group. Children are dramatically skewed towards the gastrointestinal symptoms (which is why they weren't detected in the early days).

This is going to be a HUGE issue in the Fall, when all respiratory illnesses come back, and covid is likely to re-surge as well. Developing a good probabilistic classifier is highly desirable. I am working on it with AHN. I am also hopeful that access to rich data from some of our data providers will allow us (Delphi) to build very good classifiers.

krivard commented 4 years ago

Investigate base rates of symptoms we're already measuring by comparing rate of symptom reports in counties with low vs high COVID-19 activity as determined by other signals.

Cross-reference with data from healthcare partners?
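A minimal sketch of the low-vs-high comparison suggested above, assuming survey responses arrive as `(county, set_of_symptoms)` pairs and that county activity comes from some other signal. The function name, the data layout, and the threshold are all hypothetical and not taken from the pipeline.

```python
from collections import defaultdict


def symptom_rates_by_activity(responses, activity, threshold):
    """Compare symptom report rates in low- vs high-activity counties.

    responses: iterable of (county, set_of_symptoms) pairs (assumed layout).
    activity:  dict mapping county -> activity level from another signal.
    threshold: counties with activity >= threshold count as "high".
    Returns {symptom: {"low": rate, "high": rate}}.
    """
    totals = {"low": 0, "high": 0}
    counts = defaultdict(lambda: {"low": 0, "high": 0})
    for county, symptoms in responses:
        if county not in activity:
            continue  # no reference signal for this county, so skip it
        group = "high" if activity[county] >= threshold else "low"
        totals[group] += 1
        for symptom in symptoms:
            counts[symptom][group] += 1
    # Rate = share of respondents in the group who reported the symptom.
    return {
        symptom: {
            group: (c[group] / totals[group] if totals[group] else float("nan"))
            for group in totals
        }
        for symptom, c in counts.items()
    }
```

A symptom whose rate is similar in both groups is mostly reflecting its base rate; a symptom whose rate is much higher in the high-activity group is a better candidate for a revised CLI definition.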

RoniRos commented 4 years ago

@drnigam shared a very relevant spreadsheet from his lab, showing priors and log odds of various symptoms.
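To show how priors and per-symptom log odds could feed a probabilistic classifier, here is a rough sketch. It assumes each symptom's value is a log likelihood ratio and that symptoms are conditionally independent (a naive Bayes simplification). The function name and the numbers in the usage example are placeholders, not values from the spreadsheet.

```python
import math


def posterior_probability(prior, log_odds_by_symptom, reported):
    """Combine a prior with per-symptom log likelihood ratios.

    prior: prior probability of COVID-19, strictly between 0 and 1.
    log_odds_by_symptom: dict symptom -> log likelihood ratio (assumed meaning).
    reported: iterable of symptoms the person reports.
    Symptoms without an entry contribute nothing to the score.
    """
    logit = math.log(prior / (1.0 - prior))  # prior expressed as log odds
    for symptom in reported:
        logit += log_odds_by_symptom.get(symptom, 0.0)
    return 1.0 / (1.0 + math.exp(-logit))  # convert log odds back to probability


# Placeholder numbers, for illustration only.
example = posterior_probability(
    prior=0.05,
    log_odds_by_symptom={"loss_of_smell": 2.0, "cough": 0.3},
    reported=["loss_of_smell", "cough"],
)
```

This version only adds evidence for reported symptoms; a fuller model would also use the absence of a symptom as evidence and would need separate tables per age group.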

brookslogan commented 4 years ago

Another thing to consider when redefining symptom sets: accuracy vs. timeliness (and sensitivity vs specificity and what level of illness to focus on).

If we want to increase lead on the indicators, then filtering based on all CLI criteria may be too much; some of these symptoms may not show up until later in the progression of such illness, by which point patients may have already interacted with the health care and/or testing systems. Additionally, descriptions of ordering/presence of symptoms may focus on more serious cases.

On the other hand, we may want to require a stricter definition of CLI to try to get CLI vs. ILI to help separate actual COVID-19 and non-COVID-19 activity.

RoniRos commented 4 years ago

So based on Logan's argument, maybe the goal is to build a classifier based on symptoms that occur prior to being lab-confirmed. This requires significant data about symptom onset dates, combined with eventual test results (or failure to test). Maybe the CHC line level data could be used for this.

See a very relevant recent publication by Nigam et al.

krivard commented 4 years ago

I thought we couldn’t get test results from CHC?

On Sun, Aug 30, 2020 at 4:32 PM RoniRos notifications@github.com wrote:

> So based on Logan's argument, maybe the goal is to build a classifier based on symptoms that occur prior to being lab-confirmed. This requires significant data about symptom onset dates, combined with eventual test results (or failure to test). Maybe the CHC line level data could be used for this.
>
> See a very relevant recent publication by Nigam et al. https://www.nature.com/articles/s41746-020-0300-0

RoniRos commented 4 years ago

Correct. But we do get HCPCS codes, which are allegedly equivalent. We still don't get the test results. The ICD code U07.1 is supposedly equivalent to a positive test result, but if the result arrives late and there is no further billing, the code may never be generated. On the other hand, a person with a positive test result may have subsequent COVID-related interactions, in which U07.1 should be recorded. My hope is that following a particular patient across time will give us a picture of what's happening in practice. But that's only a hope at this point.
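As a rough illustration of following a patient across time, the sketch below finds the earliest encounter per patient that carries U07.1. The record layout `(patient_id, encounter_date, codes)` is an assumption for the example and does not describe the actual CHC line-level schema.

```python
from datetime import date


def first_u071_by_patient(encounters):
    """Return {patient_id: earliest date on which U07.1 appears}.

    encounters: iterable of (patient_id, encounter_date, codes) tuples,
    where codes is a collection of ICD-10 code strings (assumed layout).
    Patients who never receive U07.1 are absent from the result.
    """
    first = {}
    for patient_id, when, codes in encounters:
        if "U07.1" not in codes:
            continue
        if patient_id not in first or when < first[patient_id]:
            first[patient_id] = when
    return first
```

Comparing these dates against each patient's earlier symptom-coded visits would show how often, and how late, U07.1 actually shows up in practice.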