SORMAS-Foundation / SORMAS-Project

SORMAS (Surveillance, Outbreak Response Management and Analysis System) is an early warning and management system to fight the spread of infectious diseases.
https://sormas.org
GNU General Public License v3.0

Automatic case classification for existing SORMAS diseases [13] #61

Closed StefanSzczesny closed 6 years ago

StefanSzczesny commented 7 years ago

The classification of a case should be done automatically, based on defined rules. These rules include a combination of specific symptoms, epidemiological data, vaccination/immunization, lab test results and more.

The definitions are provided in the SORMAS data dictionary and will be implemented with a rule system (probably not a rule engine) that combines different criteria and can be extended to be customizable by admin users later. https://docs.google.com/spreadsheets/d/1BA_RjM-ZhxFrzpAJdv0UVGhicHXNP6JnPBA-xzdrQKs/edit#gid=466434037
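To illustrate what a rule system without a rule engine could look like, here is a minimal sketch in which each criterion is a plain function and rules are built by combining criteria. All names (`has_symptom`, `positive_test`, `all_of`, `any_of`, `classify`) and the example rules are hypothetical; the real definitions are the ones in the data dictionary, not the ones shown here.

```python
from typing import Callable, Dict

# A criterion is any function that inspects the case data and answers yes/no.
Criterion = Callable[[Dict], bool]


def has_symptom(name: str) -> Criterion:
    """Criterion: the case reports the given symptom."""
    return lambda case: name in case.get("symptoms", set())


def positive_test(test_type: str) -> Criterion:
    """Criterion: the case has a positive result for the given test type."""
    return lambda case: test_type in case.get("positive_tests", set())


def all_of(*criteria: Criterion) -> Criterion:
    """Combine criteria with logical AND."""
    return lambda case: all(c(case) for c in criteria)


def any_of(*criteria: Criterion) -> Criterion:
    """Combine criteria with logical OR."""
    return lambda case: any(c(case) for c in criteria)


# Made-up rules for an imaginary disease, for illustration only.
suspect = all_of(has_symptom("fever"), any_of(has_symptom("rash"), has_symptom("cough")))
confirmed = all_of(suspect, positive_test("pcr"))


def classify(case: Dict) -> str:
    """Return the strongest classification whose rule matches."""
    if confirmed(case):
        return "CONFIRMED"
    if suspect(case):
        return "SUSPECT"
    return "NOT_CLASSIFIED"
```

Because rules are ordinary data built from small functions, they could later be constructed from admin-defined configuration instead of being hard-coded, which is one way to leave room for the customization mentioned above.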

The classification is done whenever any of the above data is changed. It is possible to manually re-classify cases.

The relevant diseases are: EVD, Lassa fever, HPAI, CSM, Measles, Cholera, Yellow fever, Dengue Fever, Monkeypox, and Plague.

MartinWahnschaffe commented 6 years ago

@hzi-braunschweig

From my understanding the automatic case classification will need the following additional fields:

I'm not sure about the following points:

  1. What is the exact definition of "epidemiological link" in terms of the data we are collecting in SORMAS? Is it a contact to a confirmed case? Or also a contact to a probable/suspected case?
  2. What does "suspected from a verified rumor" exactly mean ("probable" for EVD)?
  3. What is the exact definition of "epidemiological risk factor" (also in comparison to point 1)?
  4. Is there a difference between "a person with fever" (HPAI) and "a person with sudden onset of fever" (CSM)?
  5. How is "detection of fourfold increase in yellow fever IgM, or IgG antibody titres between acute and convalescent serum samples, or both" represented in SORMAS? Will this need two separate samples that are compared? This would mean that we also have to enter the exact test result (yellow fever IgM).
  6. What about positive postmortem liver histopathology? Is "liver histopathology" a new test type we need?
  7. What about the case classification rules for monkeypox and the different plague types?
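
Regarding point 5: if the exact titre were recorded for each sample, the fourfold-increase check could be a comparison of two results. This is only a sketch under that assumption; `TitreResult` and `fourfold_increase` are hypothetical names, and whether SORMAS should store numeric titres is exactly the open question.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TitreResult:
    """One quantitative antibody result (hypothetical structure)."""
    sample_date: date
    titre: float  # reciprocal dilution, e.g. 40 for a titre of 1:40


def fourfold_increase(acute: TitreResult, convalescent: TitreResult) -> bool:
    """True if the convalescent titre is at least four times the acute titre."""
    if convalescent.sample_date <= acute.sample_date:
        raise ValueError("convalescent sample must be taken after the acute sample")
    if acute.titre <= 0:
        raise ValueError("acute titre must be positive")
    return convalescent.titre >= 4 * acute.titre
```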
MartinWahnschaffe commented 6 years ago

Results from yesterday's meeting at HZI.

I have also updated the case classifications in the data dictionary: https://docs.google.com/spreadsheets/d/1BA_RjM-ZhxFrzpAJdv0UVGhicHXNP6JnPBA-xzdrQKs/edit#gid=466434037

Case Data

Symptoms

Epi Data

Samples

Sample Tests

Questions

  1. Adding the "No (additional) sample can be taken" checkbox to the samples section in the app would be a lot of work and would require changes to the existing design. On the other hand, adding this checkbox to another section (e.g. case data) would not make a lot of sense. I'm wondering whether we really need this checkbox. An alternative would be to simply use "Deceased suspected case having an epi-link to a confirmed case" as the rule for "probable".
  2. For CSM the classification had no definition for the duration of the incubation period (used for the epi link in probable cases). I have used the 10 days that are defined in SORMAS, okay?
  3. We don't have a symptom "Bloody diarrhea". Instead there is a symptom for "Diarrhea" and a symptom for "Blood in stool". I'd interpret the combination of both symptoms as bloody diarrhea for EVD suspected cases, okay?
  4. For dengue we currently don't have "probable". I was wondering whether the combination of a suspected case (not necessarily deceased) with "exposure to disease infected neighborhood within incubation period" would make sense?
  5. We don't have "severe dehydration" (Cholera suspected). Instead only "dehydration" - used for EVD, Lassa and Cholera. Should we (A) rename this field, (B) use it for the classification as is or (C) add a second field?
  6. For Lassa suspected the rule contains "contact with excreta or urine of rodents". In the epi data section we already have "Contact with: Rodents or their excreta". I expect we can use this?
  7. "Absence of yellow fever immunization within 30 days before onset": We currently have a field for vaccination and vaccination date. How does this translate? Must the vaccination take place before the 30 days before onset?
  8. Finally, the influenza epi link part. This one is tricky:

This is the original set of epi links for HPAI:

  a. Close contact (within 1 meter) with a person who is a suspected, probable, or confirmed HPAI case
  b. Exposure (e.g. handling, slaughtering, de-feathering, butchering, preparation for consumption) to poultry or wild birds or their remains or to environments contaminated by their faeces in an area where H5N1 infections in animals or humans have been suspected or confirmed in the last month
  c. Consumption of raw or undercooked poultry products in an area where H5N1 infections in animals or humans have been suspected or confirmed in the last month
  d. Close contact with a confirmed H5N1 infected animal other than poultry or wild birds
  e. Handling samples (animal or human) suspected of containing H5N1 virus in a laboratory or other setting

Result from yesterday and my thoughts:

  a. Same room with a probable or confirmed case
  b. Exposure (e.g. handling, slaughtering, de-feathering, butchering, preparation for consumption) to animals (or their remains) likely to have been infected by the disease
  c. Consumption: Not sure how to change this one.
  d. and e. These would no longer be necessary (see b.), right?

Besides this question, this is what we are currently asking for in SORMAS (see screenshot attached). I'm wondering whether we need all of this / what should be used/changed for the epi links above.

[Screenshot: hpai_epidata]

"Please indicate an answer regarding ALL animals (live or dead) the person had direct exposure to during the incubation period."

  • Eating raw or undercooked poultry
  • Exposure to poultry or domesticated birds
  • Exposure to sick/unexplained dead poultry/other domest. birds
  • other animals
  • exposure to wild birds
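
If the interpretations proposed in questions 2 and 3 are accepted, both could be expressed as small helper checks. The sketch below is an assumption-laden illustration: the symptom keys, the function names and the choice to treat both ends of the incubation window as inclusive are all hypothetical.

```python
from datetime import date, timedelta
from typing import Set

CSM_INCUBATION_DAYS = 10  # the value mentioned in question 2


def has_bloody_diarrhea(symptoms: Set[str]) -> bool:
    """Question 3: treat 'diarrhea' plus 'blood in stool' as bloody diarrhea."""
    return {"diarrhea", "blood_in_stool"} <= symptoms


def contact_within_incubation(contact_date: date, onset_date: date,
                              incubation_days: int) -> bool:
    """Question 2: did the contact fall inside the incubation period before onset?

    Assumption: the window is [onset - incubation_days, onset], both ends inclusive.
    """
    earliest = onset_date - timedelta(days=incubation_days)
    return earliest <= contact_date <= onset_date
```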
MartinWahnschaffe commented 6 years ago

Answers

  1. OK. As long as this rule is only applicable to deceased suspect cases, it may be acceptable to simplify as suggested.
  2. OK
  3. OK
  4. Please rephrase to “exposure to neighborhood where confirmed cases occurred within incubation period”
  5. Simply use “dehydration” throughout, without the adjective “severe”
  6. OK
  7. OK
  8. Use the following rules:
    • Close contact (within 1 meter) with a probable or confirmed case
    • Exposure (e.g. handling, butchering) to animals (or their remains) in an area where infected animals have been confirmed in the past month
    • Consumption of raw or undercooked animal products in an area where infected animals have been confirmed in the past month
    • Handling samples (animal or human) suspected of containing the virus without adequate personal protection
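
The four rules in answer 8 could be evaluated as a simple "any of" check over the epi data. This is a sketch only: the `EpiData` field names are hypothetical and do not correspond to existing SORMAS fields, and it assumes that a single matching rule is enough to establish the epi link.

```python
from dataclasses import dataclass


@dataclass
class EpiData:
    """Hypothetical flags, one per rule in answer 8."""
    close_contact_with_probable_or_confirmed_case: bool = False
    animal_exposure_in_confirmed_area: bool = False
    raw_animal_products_in_confirmed_area: bool = False
    handled_samples_without_protection: bool = False


def has_epi_link(epi: EpiData) -> bool:
    """Assumption: the epi link holds as soon as any one rule is met."""
    return any((
        epi.close_contact_with_probable_or_confirmed_case,
        epi.animal_exposure_in_confirmed_area,
        epi.raw_animal_products_in_confirmed_area,
        epi.handled_samples_without_protection,
    ))
```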

In addition:

MartinWahnschaffe commented 6 years ago

Maté and I have discussed my idea of a case classification log. A problem with this is that the log could sometimes become quite big and result in unnecessary data that needs to be transferred to the mobile app.

From our point of view the essential data for a mobile user is:

If it becomes necessary to look into the full history of classification changes this could be implemented in the web app at some point based on the data from the history tables.

MartinWahnschaffe commented 6 years ago

For comment:

Testing:

Add classifications for

MartinWahnschaffe commented 6 years ago

Three possible rule engines:

  1. Drools (https://www.drools.org/): Uses a script language for the rule definitions. Comes with a whole set of tools and also a process engine. Here is a short introduction article: https://medium.com/@ryanjollyyoung/why-should-i-use-drools-ba80be3b5311
  2. OpenRules uses Excel for rule files. I couldn't find the source code though and the documentation looks very old-schoolish.
  3. "Decision Model and Notation" (DMN): A standard of the OMG (https://en.wikipedia.org/wiki/Decision_Model_and_Notation, https://docs.camunda.org/manual/7.9/reference/dmn11/decision-table/). Uses decision tables that are stored in an XML format. Flowable (java business process engine: https://www.flowable.org/) comes with a rule engine for this.

At this point I wouldn't suggest adding a rule engine, though. It would simply add a lot of overhead to the whole thing. The way to go is to put the logic for the automatic case classification in its own class and give it a clean interface. If needed, this can easily be replaced at a later point in time.
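
As a rough illustration of "own class with a clean interface", the sketch below hides the rules behind a single method so that a hard-coded implementation could later be swapped for one backed by a rule engine. The names `CaseClassifier`, `HardCodedClassifier`, `CaseClassification` and `reclassify`, as well as the toy rule, are hypothetical and not taken from the SORMAS code base.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Any, Dict


class CaseClassification(Enum):
    NOT_CLASSIFIED = "not classified"
    SUSPECT = "suspect"
    PROBABLE = "probable"
    CONFIRMED = "confirmed"


class CaseClassifier(ABC):
    """The only thing the rest of the application needs to know about."""

    @abstractmethod
    def classify(self, case: Dict[str, Any]) -> CaseClassification:
        ...


class HardCodedClassifier(CaseClassifier):
    """Toy implementation with rules written directly in code."""

    def classify(self, case: Dict[str, Any]) -> CaseClassification:
        if case.get("lab_confirmed"):
            return CaseClassification.CONFIRMED
        if case.get("fever"):
            return CaseClassification.SUSPECT
        return CaseClassification.NOT_CLASSIFIED


def reclassify(case: Dict[str, Any], classifier: CaseClassifier) -> CaseClassification:
    """Callers depend on the interface only, never on a concrete rule set."""
    case["classification"] = classifier.classify(case)
    return case["classification"]
```

A rule-engine-backed classifier would be another subclass of `CaseClassifier`; `reclassify` and its callers would not need to change.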

The most important part will be unit tests that cover all possible rule paths.
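
One possible way to cover every rule path is to enumerate all combinations of the rule inputs and compare each result against an explicit expectation table. The sketch below does this for a toy rule with three boolean inputs; the `classify` function and its inputs are made up for the example and are not a real SORMAS rule.

```python
import itertools
import unittest


def classify(fever: bool, epi_link: bool, lab_confirmed: bool) -> str:
    """Hypothetical toy rule, used only to have something to test."""
    if fever and lab_confirmed:
        return "CONFIRMED"
    if fever and epi_link:
        return "PROBABLE"
    if fever:
        return "SUSPECT"
    return "NOT_CLASSIFIED"


# Expected result for every (fever, epi_link, lab_confirmed) combination.
EXPECTED = {
    (False, False, False): "NOT_CLASSIFIED",
    (False, False, True): "NOT_CLASSIFIED",
    (False, True, False): "NOT_CLASSIFIED",
    (False, True, True): "NOT_CLASSIFIED",
    (True, False, False): "SUSPECT",
    (True, False, True): "CONFIRMED",
    (True, True, False): "PROBABLE",
    (True, True, True): "CONFIRMED",
}


class ClassificationPathsTest(unittest.TestCase):
    def test_every_input_combination(self):
        combos = list(itertools.product([False, True], repeat=3))
        # Guard against a forgotten row in the expectation table.
        self.assertEqual(set(combos), set(EXPECTED))
        for combo in combos:
            with self.subTest(combo=combo):
                self.assertEqual(classify(*combo), EXPECTED[combo])
```

Saved as a module, this can be run with `python -m unittest`. With many inputs the full product grows quickly, so real rules would probably need the table restricted to the inputs each disease actually uses.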

MateStrysewske commented 6 years ago

@hzi-braunschweig

Is "Malaise" equivalent to "Fatigue/Weakness"? -> Yes

New Influenza: Which test type is meant by "neutralization antibody test"? -> New test type needed, see #780

MartinWahnschaffe commented 6 years ago

Adjust case classification visual presentation:

[Image]

Fix bugs (tested on symeda server):

MateStrysewske commented 6 years ago

I can't reproduce the bugs on the symeda server.