eth-mds / ricu

🏥 ICU data with R 🏥
https://eth-mds.github.io/ricu/
GNU General Public License v3.0

How to add `caregiver` concept? #36

Closed mlondschien closed 11 months ago

mlondschien commented 1 year ago

I am interested in variables that describe possible heterogeneity in the data. One such variable is cgid (caregiver ID) in the chartevents table of MIMIC-III:

CGID is the identifier for the caregiver who validated the given measurement

(ref).

I would like to add a concept that returns, for each hour where at least one measurement was entered into the system, the list of caregivers involved. I tried the following:

  "caregiver": {
    "category": "misc",
    "sources": {
      "mimic": {
        "table": "chartevents",
        "val_var": "cgid",
        "target": "ts_tbl",
        "class": "col_itm"
      }
    }
  },

However, this raises

Error in eval(assertion, env) : 
  argument "ids" is missing, with no default

How should I go about adding such a concept?

prockenschaub commented 1 year ago

I think you just forgot to wrap the item definitions in a list [].

"caregiver": {
    "category": "misc",
    "sources": {
      "mimic": [{
        "table": "chartevents",
        "val_var": "cgid",
        "target": "ts_tbl",
        "class": "col_itm"
      }]
    }
}

The hint was in the traceback, where it was trying to create a sel_itm (the default) despite you specifying a col_itm. One further thing to look out for is that you have currently defined this as a numeric concept. This means that, by default, multiple rows with the same time stamp (= hourly bin) will be aggregated by the median. This is most likely not what you are looking for, so you will want to use unique as the aggregation.

load_concepts("caregiver", src = "mimic", aggregate = unique)

You should be able to specify this in the json too (see "dobu_dur" for an example of how), but it currently throws an error because the string is passed to dt_gforce, which has a very narrow definition of what should be considered an aggregation function. This is probably something we should change.
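To illustrate why the default median is a poor fit for identifier columns, here is a minimal base R sketch that does not use ricu at all. The caregiver IDs and the hourly bins are made up for illustration, and `aggregate()` merely stands in for whatever ricu does internally.

```r
# Toy data: hypothetical caregiver IDs recorded in hourly bins
# (invented values, not taken from MIMIC).
obs <- data.frame(
  hour = c(0, 0, 0, 1, 1),
  cgid = c(101, 205, 205, 333, 101)
)

# Median per hourly bin: treats the IDs as ordinary numbers.
med <- aggregate(cgid ~ hour, data = obs, FUN = median)
print(med)
# hour 1 yields 217, which is not an ID that occurs in the data

# Unique per hourly bin: keeps every distinct caregiver once.
uniq <- unique(obs[, c("hour", "cgid")])
print(uniq)
```

With the median, the bin for hour 1 collapses to a value that belongs to no caregiver, whereas `unique` keeps one row per caregiver and hour.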

mlondschien commented 1 year ago

Thanks for the hint, also with unique :).

Followup: I want to implement a provider concept. In MIMIC-IV (2.2), the prescriptions table has a column order_provider_id that I'd like to add as a ts_tbl via "index_var": "starttime". If I define

    "provider": {
        "category": "misc",
        "aggregate": "unique",
        "sources": {
            "miiv": [
                {
                    "table": "admissions",
                    "val_var": "admit_provider_id",
                    "class": "col_itm",
                    "target": "id_tbl"
                },
                ...
                {
                    "table": "poe",
                    "val_var": "order_provider_id",
                    "class": "col_itm",
                    "target": "ts_tbl"
                },
                {
                    "table": "prescriptions",
                    "val_var": "order_provider_id",
                    "class": "col_itm",
                    "target": "ts_tbl",
                    "index_var": "starttime"
                }
            ]
        }
    }

and run ricu::load_concepts("provider", "miiv", aggregate = unique), this raises ! is_target(x = x, dat = res) is not TRUE. If I replace ts_tbl with win_tbl:

                {
                    "table": "prescriptions",
                    "val_var": "order_provider_id",
                    "class": "col_itm",
                    "target": "win_tbl",
                    "index_var": "starttime",
                    "dur_var": "stoptime"
                }

this yields

Error in colnamesInt(x, names(on), check_dups = FALSE) : 
  argument specifying columns specify non existing column(s): cols[3]='stoptime'

(stoptime is a column of prescriptions).

What am I doing wrong here?

prockenschaub commented 1 year ago

I am afraid I haven't installed the MIMIC IV version 2.2 needed for the provider IDs. However, my best guess is that you are trying to combine id_tbls and ts_tbls. The result coming from admissions is an id_tbl, whereas the others you listed are ts_tbls. The provider concept itself expects to receive ts_tbls (this is the default for concepts and can be changed by the "target" attribute, see for example "sex" or "ett_gcs").

You get the exact same error if you remove ts_to_win_tbl(mins(1L)) from the callback of AUMC's "ett_gcs" concept. You likely need another callback function id_to_ts_tbl that specifies how the id_tbl in your example should be upgraded to a ts_tbl.

mlondschien commented 1 year ago

Thanks @prockenschaub!

Together with https://github.com/eth-mds/ricu/pull/40, the following json

    "provider": {
        "category": "misc",
        "aggregate": "unique",
        "sources": {
            "miiv": [
                {
                    "table": "admissions",
                    "val_var": "admit_provider_id",
                    "class": "col_itm",
                    "index_var": "admittime",
                    "target": "ts_tbl"
                },
                ...
                {
                    "table": "poe",
                    "val_var": "order_provider_id",
                    "class": "col_itm",
                    "target": "ts_tbl"
                },
                {
                    "table": "prescriptions",
                    "val_var": "order_provider_id",
                    "class": "col_itm",
                    "target": "ts_tbl",
                    "index_var": "starttime"
                }
            ]
        }
    }

works.

For the resulting miiv table, for 1081583 out of 5816262 rows, the index_var is negative (47508 of these from admissions). Am I correct to assume this is due to admissions / procedures from before the patient was transferred from the hospital to the ICU?

prockenschaub commented 11 months ago

Yes, if you use the ICU stay as your time origin (the default), negative times relate to things that happened earlier, outside of the ICU.
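As a rough illustration of how negative times arise, the following base R sketch computes times relative to an ICU admission. All timestamps and the helper `rel_hours` are hypothetical, and this is not how ricu computes its index internally.

```r
# Hypothetical timestamps for a single stay (invented values).
icu_intime <- as.POSIXct("2020-01-02 10:00:00", tz = "UTC")  # ICU admission = time origin
admittime  <- as.POSIXct("2020-01-01 22:00:00", tz = "UTC")  # hospital admission
rx_start   <- as.POSIXct("2020-01-02 14:30:00", tz = "UTC")  # prescription start

# Illustrative helper: hours elapsed since the chosen origin.
rel_hours <- function(t, origin) {
  as.numeric(difftime(t, origin, units = "hours"))
}

rel_hours(admittime, icu_intime)  # -12: hospital admission precedes the ICU stay
rel_hours(rx_start, icu_intime)   # 4.5: prescription starts during the ICU stay
```

Any event recorded before the ICU admission, such as the hospital admission itself, ends up with a negative relative time.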

Closing this issue now because I think it's resolved.