MIT-LCP / eicu-code

Code and website related to the eICU Collaborative Research Database
https://eicu-crd.mit.edu
MIT License

Cardiopulmonary resuscitation #62

Closed dr-romster closed 5 years ago

dr-romster commented 5 years ago

Is there a way to determine if patients received CPR / suffered cardiac arrest prior to (or during) ICU admission?

dr-romster commented 5 years ago

Partially solved: information on pre-ICU arrests is in apachepredvar -> admitdiagnosis

-- Count ICU stays per APACHE admission diagnosis, most frequent first
SELECT admitdiagnosis, COUNT(*) AS n
FROM eicu.apachepredvar
GROUP BY admitdiagnosis
ORDER BY n DESC;

alistairewj commented 5 years ago

You might also want to take a look at the admissionDx table. It will contain the same admit diagnosis as you've found here, but with a bit more detail. It also contains some miscellaneous extra variables - e.g. acute MI location, pre-OP MI during hospitalization, etc. Definitely some usable information there.

select admitdxpath, count(*)
from admissiondx
-- filter out the apache admission diagnoses
where lower(admitdxpath) not like 'admission diagnosis|all diagnosis%'
group by 1
order by 1;

DrSavvas commented 5 years ago

Can I clarify something? Which is the better source for answering dr-romster's question: the ICD code variable in the diagnosis table, which has the 427.5 (ICD-9) and I46 (ICD-10) codes for cardiac arrest (used with an admission offset window of, say, 24h), apacheadmissiondx in the patient table, or the admissionDx table mentioned above? Thanks guys, S

tompollard commented 5 years ago

@DrSavvas it would make sense to use information from all of the different sources. e.g. if ICD codes do not capture an event, but admissionDx does, then be explicit about your approach for handling disagreement. A good first step would be some exploratory analysis to understand these variables and how they overlap.
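
As a starting point for that kind of exploratory analysis, here is a minimal sketch of an overlap count across the three sources. It assumes the public PostgreSQL schema (tables `diagnosis`, `admissiondx` and `patient`), that `icd9code` stores codes as comma-separated text, and that cardiac arrest admissions contain the phrase "cardiac arrest" in `admitdxpath` and `apacheadmissiondx`. The string patterns are unvalidated assumptions, and the 24-hour window is the one proposed above (offsets are in minutes from unit admission).

```sql
-- Sketch only: the LIKE patterns below are assumptions, not validated definitions.
with icd as (
    -- stays with a cardiac arrest ICD code recorded within 24h of unit admission
    select distinct patientunitstayid
    from diagnosis
    where (icd9code like '%427.5%' or icd9code like '%I46%')
      and diagnosisoffset <= 24 * 60
),
adx as (
    -- stays whose admission diagnosis path mentions cardiac arrest
    select distinct patientunitstayid
    from admissiondx
    where lower(admitdxpath) like '%cardiac arrest%'
),
apache as (
    -- stays whose APACHE admission diagnosis mentions cardiac arrest
    select patientunitstayid
    from patient
    where lower(apacheadmissiondx) like '%cardiac arrest%'
)
select
    (icd.patientunitstayid is not null) as in_icd
  , (adx.patientunitstayid is not null) as in_admissiondx
  , (apache.patientunitstayid is not null) as in_apache
  , count(*) as n_stays
from patient p
left join icd on p.patientunitstayid = icd.patientunitstayid
left join adx on p.patientunitstayid = adx.patientunitstayid
left join apache on p.patientunitstayid = apache.patientunitstayid
-- keep only stays flagged by at least one source
where icd.patientunitstayid is not null
   or adx.patientunitstayid is not null
   or apache.patientunitstayid is not null
group by 1, 2, 3
order by 1, 2, 3;
```

Each output row is one combination of sources, so rows where only one flag is true show the disagreements that would need an explicit handling rule.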

DrSavvas commented 5 years ago

Yes, that’s what I’m doing at the moment, was just wondering whether you guys had any preformed opinions on this. Thanks Tom

tompollard commented 5 years ago

Thanks @DrSavvas, I don't have any advice at this stage, but there may be others who would like to comment. It would be interesting to see the results of any exploratory work posted here.

DrSavvas commented 5 years ago

Dear Tom, quick question: in the Nature paper, you mention that 'A stratified random sample of patients was used to select patients for inclusion in the public dataset' and then go on to describe the sampling process. What is the rough percentage of the initial sample that was included in the public dataset?

DrSavvas commented 5 years ago

Dear Tom,

A. Thanks for the super-speedy replies.

B. I’m looking at volume-outcome relationship in cardiac arrest and in order to back-calculate ICU-specific volume I would ideally need

  1. The actual number of all admissions per ICU over the two years or
  2. the weighting factor used for each ICU during the sampling process

I appreciate it might not be possible for you to share any of this but if it is possible, it would give the dataset a huge potential in terms of health systems research. The NIS database does share the weighting variable so one can back-calculate actual (estimates) patient volume. Happy to discuss this offline if you need more information. Thanks again, Savvas

jraffa commented 5 years ago

I think calculations like the ones you're suggesting may require too many assumptions. The sampling was done on a patient basis, not by admission.

With that said, the weight was fixed across all hospitals and applied to the index ICU stay: if a patient's index stay was selected, all of their 2014-15 stays were subsequently included. The weight was somewhere between 0.2 and 0.3. I don't have a problem disclosing it per se, but I fear it may be misused if people don't understand the complete process.

DrSavvas commented 5 years ago

Hi Jesse, I'm an epidemiologist so I think I get SRS and I understand that in this case it is patient-based and not admission-based. If sampling was performed as Stratified Random Sampling, with each hospital (stratum) contributing [insert here weighting variable: 0.1 or 0.2-0.3?] of their patients to the final sample, then it should be easy to back-calculate the actual patient volume for each hospital for the study period. The relationship does not apply to the number of admissions (obviously), since each included patient had all their admissions included.

As a statistician, you might note that random sampling based on hospital strata does not ensure that the distribution of diagnoses within each hospital stratum remains stable; the random process itself may lead to over- or under-representation of a particular diagnosis. That is an unavoidable risk and I'm not aware of a method to overcome it; the repetition of the process by inclusion of multiple strata (hospitals) should ensure an averaging out of the resulting bias.

I am interested only in patients' first (index) admission for cardiac arrest, not any subsequent ones, and thus I would need the number of patients with a cardiac arrest diagnosis, not the number of admissions. Please correct me if I'm wrong and, as mentioned above, I'd gladly share details of my protocol but would prefer not to do that in a public forum. Thanks again!

jraffa commented 5 years ago

I may be missing something, but I do not understand how knowing the sampling weights adds to your analysis. Can you send me a concise statement of your objective? If not here, e-mail (mygithubusername) at mit.edu.

DrSavvas commented 5 years ago

I sent an email yesterday, pretty sure it’s the right address. Thanks

tompollard commented 5 years ago

@DrSavvas it's being discussed in a meeting right now...

DrSavvas commented 5 years ago

Sorry guys. Didn’t mean to cause chaos...

tompollard commented 5 years ago

No problem, it's a useful point for us to clarify :) We'll be updating the documentation shortly.

DrSavvas commented 5 years ago

Hi guys, I haven't heard back since sending Jesse an email. Any updates on what you suggest I do? If it is possible to get a "whole dataset" patient number per hospital, that would be a great help, and it's not identifiable data. Thanks

jraffa commented 5 years ago

Still working on it. I do not believe we will be releasing any whole dataset information, as I do not see how we can sustainably answer such queries.

The sampling weights are not sufficient to do your study, and we're working on releasing some additional information, but will likely not be able to do so until at least the end of the month. Even then, I would encourage skepticism regarding any perceived precision you will get from this approach.

DrSavvas commented 5 years ago

Ok, fair enough. I understand the issues re sustainability and I get your point re sampling weights. I will split the hospitals into lower-volume and higher-volume groups, based on total admissions in the dataset. Since it is a stratified sample, it should reflect the real lower- and higher-volume classification in the eRI dataset. Apologies for all the inconvenience, but I wanted to do it right and not "taint" the dataset with an inaccurate/unresearched publication. This is nuanced stuff and not all reviewers will pick it up. Thanks again, S
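
A minimal sketch of that split, assuming the public `patient` table with its `hospitalid` and `uniquepid` columns. The two-group split via `ntile(2)` is just one possible choice, and the counts are sampled stays and patients in the public dataset, not true hospital volume.

```sql
-- Sketch only: classifies hospitals by their volume in the sampled public data.
with hospital_volume as (
    select hospitalid
         , count(*) as n_stays                      -- ICU stays in the sample
         , count(distinct uniquepid) as n_patients  -- distinct patients in the sample
    from patient
    group by hospitalid
)
select hospitalid
     , n_stays
     , n_patients
       -- ntile(2) puts the lower half of hospitals (by stay count) in bucket 1
     , case when ntile(2) over (order by n_stays) = 1
            then 'lower' else 'higher' end as volume_group
from hospital_volume
order by n_stays;
```

Because `ntile` splits by row position, hospitals tied at the boundary can land in different groups; an explicit cut-off on `n_stays` (or on `n_patients`, given the patient-based sampling) may be easier to defend.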