MIT-LCP / mimic-code

MIMIC Code Repository: Code shared by the research community for the MIMIC family of databases
https://mimic.mit.edu
MIT License

Building a db file on a subset of the MIMIC-III data #1579

Open pshuwei opened 1 year ago

pshuwei commented 1 year ago

Description

I am curious whether the code in this repository can be run to build a db file on a subset of the data.

Thanks!

alistairewj commented 1 year ago

What format do you expect a db file to be?

pshuwei commented 1 year ago

> What format do you expect a db file to be?

To clarify, I have managed to run the shell script that compiles all the csv.gz files into a single SQLite database file. I was wondering whether I could do the same thing with, say, 10% of the MIMIC-III patients, or any other fraction of the dataset.

alistairewj commented 1 year ago

Yes, for sure! You can run the same code using the demo dataset: https://physionet.org/content/mimiciii/

That would give you a 100 patient subset.

pshuwei commented 1 year ago

Hi, thanks for your response.

Does this MIMIC demo dataset contain the same amount of information for 100 patients, or is it simply a condensed version?

Also what if I wanted to increase from 100 to 200 patients? How would I go about that?

alistairewj commented 1 year ago

The demo dataset is simply a filter applied to every table in the database: a row is kept only if its subject_id is in a list of 100 subject_id values selected a priori. We also remove the noteevents table.

You can easily recreate this if you have the full dataset and expand the subject_id list.
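
For anyone wanting to try this, here is a minimal sketch of that filter in Python (standard library only). It is not the script used to build the official demo. The function names are made up for illustration, and it assumes the full dataset is a directory of `*.csv.gz` files, that tables without a subject_id column (for example the dictionary tables) should be copied unchanged, and that a random sample of patients is acceptable in place of a hand-picked list.

```python
import csv
import gzip
import random
import shutil
from pathlib import Path


def sample_subject_ids(patients_csv_gz, fraction, seed=0):
    """Pick a reproducible random fraction of subject_id values from the patients table."""
    with gzip.open(patients_csv_gz, "rt", newline="") as f:
        reader = csv.DictReader(f)
        # Header case may differ between builds, so match the column case-insensitively.
        col = next(c for c in reader.fieldnames if c.lower() == "subject_id")
        ids = sorted({row[col] for row in reader})
    k = min(len(ids), max(1, round(len(ids) * fraction)))
    return set(random.Random(seed).sample(ids, k))


def filter_table(src, dst, keep_ids):
    """Write the rows of src whose subject_id is in keep_ids to dst.

    Returns the number of rows written, or None (writing nothing)
    if the table has no subject_id column.
    """
    with gzip.open(src, "rt", newline="") as fin:
        reader = csv.reader(fin)
        header = next(reader, None)
        lowered = [h.lower() for h in header or []]
        if "subject_id" not in lowered:
            return None
        idx = lowered.index("subject_id")
        written = 0
        with gzip.open(dst, "wt", newline="") as fout:
            writer = csv.writer(fout)
            writer.writerow(header)
            for row in reader:
                if row[idx] in keep_ids:
                    writer.writerow(row)
                    written += 1
    return written


def build_subset(src_dir, dst_dir, keep_ids, skip=("noteevents",)):
    """Filter every *.csv.gz table in src_dir into dst_dir, skipping the tables named in skip."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    counts = {}
    for src in sorted(src_dir.glob("*.csv.gz")):
        if src.name.split(".")[0].lower() in skip:
            continue  # the demo also drops noteevents
        dst = dst_dir / src.name
        counts[src.name] = filter_table(src, dst, keep_ids)
        if counts[src.name] is None:
            # Assumption: tables without subject_id are copied unchanged.
            shutil.copyfile(src, dst)
    return counts
```

To go from 100 to 200 patients you would pass a longer set of subject_id values as `keep_ids` (as strings, since the CSV values are compared as text), or use `sample_subject_ids(..., fraction=0.1)` for roughly 10% of patients. The output directory can then be fed to the same import code as the full dataset. Note that the rewritten files may use different quoting from the originals, although the values are the same.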