MIT-LCP / mimic-code

MIMIC Code Repository: Code shared by the research community for the MIMIC family of databases
https://mimic.mit.edu
MIT License
2.58k stars 1.51k forks source link

Smoking status in MIMICIII_Derived dataset #1802

Closed hakim89 closed 3 weeks ago

hakim89 commented 3 weeks ago

Prerequisites

Description

Description of the issue, including:

I am trying to look for the smoking status in the MIMIC3 dataset. Can't find any columns when I search the information schema DB.

 SELECT
  *
 FROM
  `physionet-data.mimiciii_derived.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS`
 WHERE
  column_name like "%smok%"

referring to the mechanical power repository here. https://github.com/alistairewj/mechanical-power/blob/master/sql/mpwr-smoker.sql look like it can be done in MSFT sql but the same functionality can't be done in google Bigquery.

If you can point me to where the smoking hx field is located, I would greatly appreciate it. If one is not available, I will try to rewrite the above smoking hx query to use Lucene/Bigquery syntax.

Thank you.

alistairewj commented 3 weeks ago

You should be able to adapt that query pretty easily to BigQuery. It is looking in the discharge summaries to find mentions of smoking history, as usually the smoking history is described in the social history section. You'll need to use the mimiciii_notes dataset instead of mimiciii_clinical on BigQuery though - that's where the noteevents table is.

hakim89 commented 3 weeks ago

Thank you. I was able to convert the regex to BigQuery syntax like below. WHEN REGEXP_CONTAINS(ne.text, r'(never|not|not a|none|non|no|no history of|no h\/o of|denies|denies any|negative)[\s-]?(smoke|smoking|tabacco|tobacco|cigar|cigs)') Thank you for your reply.