MLD3 / FIDDLE-experiments

Experiments applying FIDDLE on MIMIC-III and eICU. https://doi.org/10.1093/jamia/ocaa139

Is the code for eICU data complete? #5

Closed: Yutong-Dai closed this issue 1 year ago

Yutong-Dai commented 1 year ago

Hi, thanks for open-sourcing the code. However, it seems that the code hosted in https://github.com/MLD3/FIDDLE-experiments/tree/jamia-replication/eicu_experiments/1_data_extraction only preprocesses a subset of the tables mentioned in the paper, namely the "medication", "nurseCharting", "patient", "lab", "respiratoryCare", and "intakeOutput" tables. Am I missing something?

Thanks, YD

shengpu-tang commented 1 year ago

Hello, as mentioned in the paper, our experiments only considered 18 tables from eICU. We also provide the preprocessed data for our experiments here: https://physionet.org/content/mimic-eicu-fiddle-feature/1.0.0/

If you would like to use other tables in eICU, I suggest you convert them into the same format and apply FIDDLE.
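As a rough illustration of that conversion, here is a minimal sketch in plain Python. It assumes FIDDLE reads a long-format table with the columns `ID`, `t`, `variable_name`, and `variable_value`; that layout is not stated in this thread, so check it against the FIDDLE README before relying on it. The helper `to_long_format`, the sample rows, and the eICU column names (`patientunitstayid`, `treatmentoffset`, `treatmentstring`) are hypothetical and used for illustration only. Time offsets are passed through unchanged.

```python
import csv
import io

# Assumed FIDDLE input columns (an assumption, not confirmed in this thread).
FIDDLE_COLUMNS = ["ID", "t", "variable_name", "variable_value"]


def to_long_format(rows, id_col, time_col, name_col, value_col=None):
    """Convert raw table rows (dicts) into long-format rows.

    If value_col is None, each row is treated as an event without a
    recorded value, and variable_value is set to 1 to mark occurrence.
    """
    out = []
    for row in rows:
        name = row[name_col].strip()
        if not name:
            continue  # skip rows that carry no variable name
        value = row[value_col] if value_col is not None else 1
        out.append({
            "ID": int(row[id_col]),
            "t": float(row[time_col]),
            "variable_name": name,
            "variable_value": value,
        })
    # Sort by ID, then by time, so each stay's records are contiguous.
    out.sort(key=lambda r: (r["ID"], r["t"]))
    return out


# Hypothetical sample shaped like an eICU "treatment" export (made-up rows).
SAMPLE = """patientunitstayid,treatmentoffset,treatmentstring
141168,120,pulmonary|ventilation and oxygenation|mechanical ventilation
141168,30,cardiovascular|shock|vasopressors
141203,15,renal|dialysis
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
long_rows = to_long_format(
    rows, "patientunitstayid", "treatmentoffset", "treatmentstring"
)

# Write the converted rows as CSV text with the assumed column order.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIDDLE_COLUMNS)
writer.writeheader()
writer.writerows(long_rows)
print(buf.getvalue())
```

For a table that records measured values, pass the name of the value column as `value_col` instead of relying on the default indicator of 1.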

Yutong-Dai commented 1 year ago

Sorry if I didn't make my point clear. For mimic3_experiments, there is an extract_data.py file that preprocesses various tables. In contrast, such a file seems to be missing from eicu_experiments. In other words, the Jupyter notebooks in eicu_experiments seem to process only the "medication", "nurseCharting", "patient", "lab", "respiratoryCare", and "intakeOutput" tables.

shengpu-tang commented 1 year ago

Hello, I think some of those files were accidentally left out, and unfortunately the original code seems to have been deleted. The code on GitHub processes the following 8 tables:

And the following 10 tables are missing the corresponding data extraction code: customLab, infusionDrug, microLab, note, nurseAssessment, nurseCare, pastHistory, physicalExam, respiratoryCharting, treatment.

I will try my best to look for these, but I cannot guarantee that I can provide the original code, since it has been more than 2 years. Because the eICU data is quite large, I would generally recommend against rerunning the preprocessing and suggest using the preprocessed data on PhysioNet instead. Otherwise, I highly welcome pull requests that add the extraction code for these tables. Thank you!