MIT-LCP / mimic-code

MIMIC Code Repository: Code shared by the research community for the MIMIC family of databases
https://mimic.mit.edu
MIT License
2.63k stars 1.53k forks source link

ICUSTAY_IDs with set tidal volume recording but no MV billing code. Why? #25

Closed ayoung01 closed 8 years ago

ayoung01 commented 9 years ago
  1. Why are billing codes more reliable? At least 29% of the ICU stays I have TV data for do not have a ventilation billing code recorded during the duration of the ICU stay. Does it make sense to exclude these ICU stays?

Please double check this. This is not good at all if accurate. There are two possibilities I can think of to explain this is that the patients with TV data but no billing codes for MV were on NIV. The first is that MV in the operating room has a different billing code and this was not given to us. Patients who had open heart surgery would have this billing code and not the one that we have. The other possibility is that sometimes, TV is recorded when patients are on NIV even though these measurements are highly inaccurate. Can we check if they have billing codes for NIV? If billing codes indeed do not capture if patients were on MV, we will have to abandon it. We should review the discharge summaries of a sample of the patients with TV but no billing codes for MV. We should have a conference call once we've done these checks.

I rechecked my work, and 3857/14759 (26.1%) of ICUSTAY_IDs with a set TV recording have no billing code for IMV. Including NIV billing codes, 3701/14759 (25.1%) of ICUSTAY_IDs with a set TV are still unaccounted for. Next, if I subset the set TV recordings by AC/CMV only, a smaller proportion - 1424/8749 (16.3%) - of ICUSTAY_IDs are missing MV billing codes. On the flip side, every ICUSTAY_ID with a MV billing code has at least 1 set TV recording.

I've mapped billing codes to ICUSTAY_IDs based on whether the billing CHARTDATE was made on or after INTIME and on or before OUTTIME of the ICU stay. I've lost 4% of the MV billing codes this way, but it's not enough to explain these results.

Here you may review the discharge notes of the patients with a set TV recording but no IMV billing code. There's a handful of duplicate notes, for multiple ICU stays with MV within the same hospital admission.

  1. Billing codes only have a time resolution of days, while we discussed a time resolution of hours for recording TV. To calculate duration of MV, I think it might make sense to validate the MV through the billing code and calculate duration starting from the first TV recording to avoid this unnecessary loss of resolution. Else, should I take the time difference between the CHARTTIME for TV and the CHARTDATE for the first MV billing code?

The way we calculate duration of MV previously is based on the first and last recording of MV settings. The first recording of MV settings is reliable as the start of MV but the last recording is questionable as MV settings may remain the same during the last few days of MV and may not be recorded. For the patients who have billing codes for MV, can we please check if the last day of billing for MV correspond to the last day MV settings were recorded? We should have a conference call once we've done these checks.

For patients who have billing codes for MV, the last day of billing for MV corresponds to the last day MV settings were recorded for 8303/11058 (75.1%) of the ICUSTAY_IDs. Of those that do not correspond, 2457/11058 (22.2%) have a set TV recording after the last day of MV billing, and 298/11058 (2.7%) have a set TV recording before the last day of MV billing.

To get a sense of the TV data, I've calculated duration of MV as suggested above. Here are the set TV recordings, named by _ICUSTAYID.txt. The TIME column is the number of hours after the first MV recording. Let me know if this looks good and I can reformat it to what we discussed.

lceli commented 9 years ago

Great job, Albert!I guess we can't rely on the billing codes to identify patients who received mechanical ventilation.We would have to stick with our previous way of looking at ventilatory settings to identify this cohort.For the time of extubation, we need to combine information from both the billing codes and ventilatory settings. Also since MetaVision was adopted, we are required to enter "extubate" as orders. Can someone look at the CPOE table for this order? How good is it in providing time of extubation?In addition, we should look for entry for other modes of oxygenation after ventilatory settings are no longer recorded. Typically, patients are put on nasal cannula, face mask, and/or hood. It is very rare to get extubated and not be given some oxygen immediately after.We need to combine all these methods to approximate the time of extubation.

Leo

alistairewj commented 9 years ago

Could you post the code that you have used to identify whether a patient is on mechanical ventilation using the billing codes?

ayoung01 commented 9 years ago

Could you post the code that you have used to identify whether a patient is on mechanical ventilation using the billing codes?

Sorry for the delay. I downloaded all the CSV files and am using R to query the data. Here is the relevant portion of my code:

cptevents <- fread("~/Documents/data/CPTEVENTS_DATA_TABLE.csv")
cpt.resp <- subset(cptevents, COSTCENTER=="Resp")
cpt.resp.mv <- subset(cpt.resp, DESCRIPTION%in%c("VENT MGMT, 1ST DAY (INVASIVE)", "VENT MGMT;SUBSQ DAYS(INVASIVE)"))

Can someone look at the CPOE table for this order? How good is it in providing time of extubation?In addition, we should look for entry for other modes of oxygenation after ventilatory settings are no longer recorded. Typically, patients are put on nasal cannula, face mask, and/or hood. It is very rare to get extubated and not be given some oxygen immediately after.We need to combine all these methods to approximate the time of extubation.

I couldn't find any record of extubation (i.e. ITEMID in (225468, 225477, 227194)) in IOEVENTS or CHARTEVENTS--do you have any idea where else I might look?

alistairewj commented 9 years ago

I've looked into this extensively. It does appear that there is some kind of billing code assigned to patients who come already ventilated to the ICU. A large number of patients are ventilated on admission and weaned, but we have no CPT code for them.

I've mapped billing codes to ICUSTAY_IDs based on whether the billing CHARTDATE was made on or after INTIME and on or before OUTTIME of the ICU stay. I've lost 4% of the MV billing codes this way, but it's not enough to explain these results.

Before I start, I should note that you have to truncate INTIME to the day. Otherwise, a CHARTDATE of 2167-01-01 00:00 will occur before an INTIME of 2167-01-01 07:00, even though these correspond to the same day. Perhaps you will gain more agreement with CPT codes by adding in this fix (I do not believe you will get complete agreement).

Back to it, let's examine CPT codes. If we define ventilation using vent settings, we can look into patients who we think are ventilated but who do not have an associated CPT code. First off, it's important to isolate out neonates, who are not assigned CPT codes.

Column CPT Count Vent flag count
Number of patients 46476 46434
... with vent code 17204 23826
... ... neonate 0 1844
... ... adult 17204 21982
Number of ICU stays 61532 60942
... with vent code 20129 27736
... ... neonate 0 1866
... ... adult 20129 25870

Next, we can examine the day of extubation. We compare the number of patients who are missing a CPT code but who have ventilator settings (middle column) with patients who have a CPT code (right column).

Extubation (days since admission) Vented ICU stay missing CPT ICU stay with CPT code
0 3991 5964
1 1202 15712
2 314 9842
3 170 7763
4 102 6500
5 79 5608
6 63 4831

The distribution among patients missing CPT codes seems quite different to those patients with CPT codes. A high proportion of patients (63.57%) who are missing CPT codes are extubated on day 0 (i.e. they are only ventilated on the day of ICU admission), where as most patients who have CPT codes are ventilated for more than 1 day and extubated much later (only 6.1% are extubated on the same day of ICU admission). A likely explanation is that many of these patients are admitted from the OR and rapidly weaned from the ventilator, and that the OR does not use the billing codes that we have received.

I think the solution is to use vent settings (including tidal volume, PEEP, etc) to define ventilation.

Can someone look at the CPOE table for this order? How good is it in providing time of extubation?In addition, we should look for entry for other modes of oxygenation after ventilatory settings are no longer recorded. Typically, patients are put on nasal cannula, face mask, and/or hood. It is very rare to get extubated and not be given some oxygen immediately after.We need to combine all these methods to approximate the time of extubation.

I couldn't find any record of extubation (i.e. ITEMID in (225468, 225477, 227194)) in IOEVENTS or CHARTEVENTS--do you have any idea where else I might look?

I believe Leo was referring to the provider order entry database - which is sourced from the hospital database and different than IOEVENTS/CHARTEVENTS. However, as far as I know we don't have any information related to procedures from POE, only data related to medications.

To expand a bit on your question about the ITEMIDs, this is a recent fix that you may not have since we only released v1.2 last week. There is now a table called PROCEDUREEVENTS_MV, and data for those ITEMIDs will be found there. The good news is that this table also contains an ID for ventilation with start/stop times. The downside is that it only exists for Metavision patients (i.e. half the database). I think the best approach is to trial a ventilation settings query and see how sensitive it is compared to (i) CPT codes and (ii) the PROCEDUREEVENTS_MV table. That's what I'm currently working on now.

alistairewj commented 8 years ago

Code to determine the duration of mechanical ventilation can be found here:

https://github.com/MIT-LCP/mimic-code/blob/master/etc/ventilation-durations.sql

I did a sensitivity analysis on using the CPT codes as a "ground truth" - here is the query/results (note, it requires you do have run the ventdurations.sql query and have the ventdurations materialized view):

with cpt as
(
select ie.SUBJECT_ID, ie.HADM_ID, ie.ICUSTAY_ID
, cpt.chartdate as cptdate
from mimiciii.icustays ie
left join mimiciii.cptevents cpt
    on ie.hadm_id = cpt.hadm_id
    and date_trunc('day',ie.intime) <= cpt.chartdate and ie.outtime >= cpt.chartdate
    and COSTCENTER = 'Resp'
    and DESCRIPTION in ('VENT MGMT, 1ST DAY (INVASIVE)', 'VENT MGMT;SUBSQ DAYS(INVASIVE)')
group by ie.SUBJECT_ID, ie.HADM_ID, ie.ICUSTAY_ID, ie.intime, ie.outtime, cpt.chartdate
)
-- define an integer which is the maximum amount of time a patient ever stayed in the ICU
, addeddays as
-- 1; 2; 3; ... ; MAXIMUM_ICU_STAY_IN_DAYS
(
select generate_series(1, cast(floor(max( extract(DAY FROM outtime - intime + (interval '1d')) )) as integer) ) * (interval '1d')
    as dayoffset
from mimiciii.icustays
)
, ietbl as
(
  -- join the admissions table against the integer for each possible day
  select ie.subject_id, ie.hadm_id, ie.icustay_id, ie.intime, ie.outtime
  , date_trunc('day',intime)+dayoffset as currday
  from mimiciii.icustays ie
  -- join against the integer as long as the integer is less than the LOS
  -- this creates a date for each day of the patient's admission
  left join addeddays
    on dayoffset <= outtime-intime+(interval '1d')
)
-- aggregate ventilation durations into days
-- note that we create a row for *every day* a patient is in the ICU
-- this allows us to evaluate the sensitivity/specificity of the two approaches
, ventdays as
(
  select
    ie.subject_id, ie.hadm_id, ie.icustay_id
    , ie.currday as chartdate
    , max(case when ve.icustay_id is not null then 1 else 0 end) as vent
  from ietbl ie
  left join ventdurations ve
    on  ie.icustay_id = ve.icustay_id
    and ie.currday between date_trunc('day',ve.starttime) and ve.endtime
  group by ie.subject_id, ie.hadm_id, ie.icustay_id, currday
)
, tt as
(
select
  coalesce(ce.subject_id, cpt.subject_id) as subject_id
  , coalesce(ce.hadm_id, cpt.hadm_id) as hadm_id
  , coalesce(ce.icustay_id, cpt.icustay_id) as icustay_id
  , ce.vent as vent_ce
  , case when cpt.cptdate is not null then 1 else 0 end as vent_cpt
  , coalesce(ce.chartdate, cpt.cptdate) as chartdate
from ventdays ce
left join cpt
  on ce.icustay_id = cpt.icustay_id and ce.chartdate = cpt.cptdate
)
select
  count(tt.subject_id) as NumObs
  , round(100.0*sum(case when vent_ce = 1 and vent_cpt = 1 then 1 else 0 end) / sum(case when vent_cpt = 1 then 1 else 0 end),2)  as TP
  , round(100.0*sum(case when vent_ce = 0 and vent_cpt = 0 then 1 else 0 end) / sum(case when vent_cpt = 0 then 1 else 0 end),2)  as TN
  , round(100.0*sum(case when vent_ce = 1 and vent_cpt = 0 then 1 else 0 end) / sum(case when vent_cpt = 0 then 1 else 0 end),2)  as FP
  , round(100.0*sum(case when vent_ce = 0 and vent_cpt = 1 then 1 else 0 end) / sum(case when vent_cpt = 1 then 1 else 0 end),2)  as FN

  , round(100.0*sum(case when vent_ce = 1 and vent_cpt = 1 then 1 else 0 end) / count(tt.subject_id),2)  as AgreePos
  , round(100.0*sum(case when vent_ce = 0 and vent_cpt = 0 then 1 else 0 end) / count(tt.subject_id),2)  as AgreeNeg
  , round(100.0*sum(case when vent_ce = 1 and vent_cpt = 0 then 1 else 0 end) / count(tt.subject_id),2)  as DisagreePos
  , round(100.0*sum(case when vent_ce = 0 and vent_cpt = 1 then 1 else 0 end) / count(tt.subject_id),2)  as DisagreeNeg
from tt
-- filter to only adult patients
inner join patients pat
  on tt.subject_id = pat.subject_id
inner join admissions adm
  on tt.hadm_id = adm.hadm_id
  and adm.admittime > pat.dob + interval '1' year;
Measure Value
Number of days examined 248,295
Agreed - Ventilated 35.81%
Agreed - Not ventilated 56.20%
Disagreed - CPT says not vent 6.82%
Disagreed - No vent settings, but CPT billed 1.17%

If we take CPT to be a ground truth, we can calculate operating point statistics:

Measure Value
Number of days examined 248,295
True Positives 96.83%
True Negatives 89.19%
False Positives 10.81%
False Negatives 3.17 %

True positives we are happy with. True negatives we are happy with. False positives we expect because we have good reason to believe CPT codes do not capture patients admitted from the OR who are ventilated. The FN should be as close to 0 as possible: there is no reason a patient should be billed for ventilation without vent settings.

Investigating the patients one by one provides insight into the false negatives.

Case Discrepancy reason Details
1 Quantization Patient extubated between midnight and 4am - no vent settings updated in that time, so flag for the day is 0
2 CPT incorrect Patient extubated 2131-05-04 20:00:00. CPT incorrectly says they are ventilated the subsequent day (2131-05-05)
3 CPT incorrect Patient put on nebs but not mechanical ventilation
4 CPT incorrect Weaned to trach mask on 17th. CPT states ventilated on 18th.
5 Quantization Patient coded at 1am before vent settings came in
6 CPT incorrect Patient mostly on trach mask, but aperiodically ventilated due to SOB

Quantization occurs because CPT is a daily measure (any ventilation > 30 minutes on the day counts) but the ventilation durations query is updated roughly every 4-6 hours (so patients can be extubated at 1am and considered not ventilated on that day).

All in all the new ventilation-durations query seems to work well. I'll close this as a solved issue now.