Issues related to how data sets are processed and other tool components

v1xerunt / Dr.Agent

Source code for JAMIA paper Dr. Agent: Clinical Predictive Model via Mimicked Second Opinions

Hello,

Thank you for sharing the code. While reproducing your work, I ran into some data-related issues. After processing the data with mimic3-benchmarks as you suggested, I found that the output does not match the format expected by your extract_demo.py tool. The tool appears to expect multiple folders containing patient information, whereas mimic3-benchmarks produces only a large number of CSV files.

When I tried to process the demographic task, I found that the train_decomp.py script filters out files whose header does not begin with "Icustay". However, the benchmark-processed files have headers beginning with "Hour" (there is no "Icustay" header), so the subsequent code cannot run. Additionally, the decomp_normalizer mentioned in your project does not appear to be included in the repository. Do you have any suggestions on how to address these issues?
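
To make the header mismatch easy to confirm, here is a minimal diagnostic sketch. It is not code from this repository: the function names `first_header_field` and `count_first_header_fields` are my own, and it only assumes that the processed files are ordinary comma-separated CSVs with a header row. It counts which column name comes first in each file's header.

```python
import csv
from collections import Counter
from pathlib import Path


def first_header_field(csv_path):
    """Return the first column name of a CSV file, or None if the file is empty."""
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle), None)
    return header[0].strip() if header else None


def count_first_header_fields(data_dir):
    """Count first-column names over all CSV files found under data_dir (recursive)."""
    counts = Counter()
    for csv_path in sorted(Path(data_dir).rglob("*.csv")):
        counts[first_header_field(csv_path)] += 1
    return counts
```

Calling `count_first_header_fields(...)` on the mimic3-benchmarks output directory (the path depends on the local setup) returns a `Counter` keyed by the first header field. If no file is counted under "Icustay", a filter on that name would reject every file.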

Thank you for your assistance.

Issues related to how data sets are processed and other tool components #1