floft / vrada

Variational Adversarial Deep Domain Adaptation implementation (TensorFlow 1.x)

mimic and postgreSQL version #1

Open Neronjust2017 opened 3 years ago

Neronjust2017 commented 3 years ago

Hi, I have downloaded MIMIC-III and installed PostgreSQL. However, when I ran the scripts, I encountered some errors. Which versions of MIMIC and PostgreSQL did you use? Thanks!

Neronjust2017 commented 3 years ago

I have tried the mimic code (v1.4.1 through the latest), but I always get "make: *** No rule to make target 'concepts48'. Stop." It seems that the Makefile doesn't contain a "concepts48" target?

    datadir="../../../../mimic-iii/"
    cd datasets/mimic-code/buildmimic/postgres
    make mimic-gz datadir="$datadir"
    make concepts
    make concepts48
    cd -
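The "No rule to make target" message means that the Makefile make is reading does not declare that target. One quick way to confirm this before starting a long build is to search the Makefile for a matching rule line. The following is only a sketch: the helper name `has_make_target` is hypothetical, and it recognizes only rules written as `target:` at the start of a line.

```shell
# Hypothetical helper (not part of mimic-code or vrada).
# Returns 0 if the Makefile at path $1 declares a rule for target $2.
# Only matches rules written as "target:" at the start of a line and
# skips variable assignments such as "target := value".
has_make_target() {
    grep -Eq "^$2[[:space:]]*:([^=]|\$)" "$1"
}

# Example use from the build directory:
#   if has_make_target Makefile concepts48; then
#       make concepts48
#   else
#       echo "this checkout has no concepts48 target" >&2
#   fi
```

If the target is missing, the checkout probably comes from a different revision of mimic-code than the one the scripts expect.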

floft commented 3 years ago

I think I used the mimic code in the subdirectories of https://github.com/itsgarrettw/vrada/tree/master/datasets which has a concepts48 in the makefile (see https://github.com/itsgarrettw/mimic-code/commit/9fe51b598c03fc5e253d35d737397991770690a3).

However, although I eventually got the code to run on the ICD9 data, I never managed to reproduce the results in the paper (so I might not have gotten the same data they used, or I may have processed it differently somehow). Another paper [1] had a footnote: "one study required identification of patients with acute hypoxemic respiratory failure (AHRF), a diagnosis which would require free-text processing, specifically identifying chest radiograph reports for mention of bilateral infiltrates (Khemani et al., 2009; Purushotham et al., 2017)." After reading that, I stopped trying to reproduce the results but left my attempt here in case anyone else finds it useful. I ended up primarily running this method on other datasets that were easier to find or create reliably.

[1] http://proceedings.mlr.press/v68/johnson17a/johnson17a.pdf

Edit: note that there's TensorFlow 2.0 code for VRADA included in my CoDATS repo (it's one of the baselines) https://github.com/itsgarrettw/codats (e.g. see https://github.com/itsgarrettw/codats/blob/294ccca16e545f5e51ca9d3cb971fcab1625b5a3/methods.py#L714). However, for that code I didn't use the MIMIC datasets because of the problems mentioned above.

Neronjust2017 commented 3 years ago

Hi, thanks a lot! I hadn't used the mimic code in the subdirectories due to my carelessness. I'm re-running the scripts now and, following your suggestion, I'm running them on an SSD. But the Jupyter notebook "8_processing.ipynb" seems to take a very long time; it has been running for 10 hours. In your experience, how long does this step take?

It is unfortunate that you didn't reproduce the results in the paper. Do you mean that you couldn't reproduce the results on either prediction task (Mortality Prediction with the AHRF dataset and ICD9 Code Prediction with the ICD9 dataset)? How big is the gap between the results you got and the ones in the paper? I want to try some ideas on this data, and if the gap is big, I may give up. Thanks!

floft commented 3 years ago

By my estimate, notebook 8 was going to take 68 days on my non-SSD hard drive, so I copied the entire database into RAM: I moved the postgres database to /dev/shm on a computer with 256 GiB of memory (this assumes you have enough RAM). After that it took more like 6 days (I think that's the total, not just notebook 8). I'm guessing an SSD would take about that long or possibly longer.
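Before copying a database into /dev/shm, it is worth checking that it will actually fit. The sketch below compares the size of a directory with the free space on a target filesystem. The helper name `fits_in` and the data directory path in the usage comment are assumptions for illustration, not taken from this repository.

```shell
# Hypothetical helper: returns 0 if directory $1 is smaller than the
# free space on the filesystem that holds path $2.
fits_in() {
    need_kb=$(du -sk "$1" | awk '{print $1}')          # size of source, in KiB
    avail_kb=$(df -Pk "$2" | awk 'NR==2 {print $4}')   # free space, in KiB
    [ "$need_kb" -lt "$avail_kb" ]
}

# Example use (the data directory path is an assumption; adjust to your install):
#   if fits_in /var/lib/postgresql/data /dev/shm; then
#       echo "database fits in /dev/shm"
#   else
#       echo "not enough room in /dev/shm" >&2
#   fi
```

If the check passes, the move amounts to stopping the server, copying the data directory to /dev/shm, and restarting PostgreSQL pointed at the copy (for example with `pg_ctl -D`). Anything under /dev/shm is lost on reboot, so keep the on-disk copy.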

Unfortunately I don't have the results/values anymore, so I don't know how different the numbers were. But yeah, I didn't reproduce the results on either of the tasks. I'm guessing the discrepancy came from the data processing code / notebooks.

My recommendation would be to use different datasets/tasks extracted from MIMIC-III. For example, there's code available for that paper by Johnson et al. at https://github.com/alistairewj/reproducibility-mimic, which reproduces various results using a (different?) mortality prediction task. There are likely more recent papers with code available too, (hopefully) including the code that extracts the task from MIMIC-III.

Neronjust2017 commented 3 years ago

Thanks a lot! I will check some other tasks from MIMIC-III and see what I can do with them.