v1xerunt / Dr.Agent

Source code for JAMIA paper Dr. Agent: Clinical Predictive Model via Mimicked Second Opinions

Issues related to how data sets are processed and other tool components #1

Open AbramsDkr opened 2 months ago

AbramsDkr commented 2 months ago

Hello,

Thank you for sharing the code. While reproducing your work, I encountered some data-related issues. After processing the data using the mimic3-benchmarks method you suggested, I found that the resulting data format does not match the one expected by your extract_demo.py tool. It seems that your expected data format consists of multiple folders containing patient information, but the mimic3-benchmarks method only produces many CSV files.

When I tried to run the demographic task, the train_decomp.py script skipped every file whose header does not start with "Icustay". The benchmark-processed files have headers that start with "Hour" and contain no "Icustay" column, so all of them are filtered out and the subsequent code cannot run. Additionally, the decomp_normalizer mentioned in your project does not appear to be included in the repository. Do you have any suggestions on how to address these issues?
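A minimal sketch for diagnosing the header mismatch described above, assuming the processed data is a directory tree of CSV files and that the filter only looks at the first header field. The function names `first_header_field` and `summarize_headers` are hypothetical helpers, not part of this repository or of mimic3-benchmarks.

```python
import csv
from collections import Counter
from pathlib import Path


def first_header_field(csv_path):
    """Return the first column name of a CSV file, or None if the file is empty."""
    with open(csv_path, newline="") as f:
        header = next(csv.reader(f), None)
    return header[0].strip() if header else None


def summarize_headers(data_dir):
    """Count how often each first header field occurs among CSV files under data_dir."""
    counts = Counter()
    for path in sorted(Path(data_dir).rglob("*.csv")):
        counts[first_header_field(path)] += 1
    return counts
```

Calling `summarize_headers` on the benchmark output directory shows at a glance whether any file starts with "Icustay" or whether every file starts with "Hour", which would explain why nothing passes the filter.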

Thank you for your assistance.

v1xerunt commented 2 months ago

Hello, this may be because the benchmark repo has updated its code and data schema. I haven't followed their updates, so there may be some compatibility issues with their current version. If you want to apply the model to the MIMIC-III data, I suggest you refer to their latest release and preprocess the data accordingly.

If you want to apply the model on other datasets, you can refer to our other packages, which have a tidier implementation:

PyEHR: https://github.com/yhzhu99/pyehr
PyHealth: https://github.com/sunlabuiuc/PyHealth