sunlabuiuc / PyHealth

A Deep Learning Python Toolkit for Healthcare Applications.
https://pyhealth.readthedocs.io
MIT License
956 stars 207 forks

question about MIMIC-III in drug recommendation task #206

Closed He-Yichen closed 1 year ago

He-Yichen commented 1 year ago

The MIMIC-III dataset used in many papers (e.g., SafeDrug, GAMENet, MoleRec) consists of 50,206 medical encounter records. After filtering out patients with only one visit, it contains 14,995 visits and 6,350 patients. The code of drug_recommendation_mimic3_fn appears to implement the same task as in those papers, but calling "mimic3_ds = mimic3_ds.set_task(task_fn=drug_recommendation_mimic3_fn)" produces only 911 patients and 1,858 visits. Why is this?
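
For reference, the filtering step described above (dropping patients with a single visit, then counting the remaining patients and visits) can be sketched on toy data. This is only an illustration: the helper name `keep_multi_visit_patients` and the toy records are hypothetical, and it is not the preprocessing code from the papers or from PyHealth.

```python
from collections import defaultdict


def keep_multi_visit_patients(visits):
    """Group (patient_id, visit_id) pairs and keep patients with >= 2 visits."""
    by_patient = defaultdict(list)
    for patient_id, visit_id in visits:
        by_patient[patient_id].append(visit_id)
    # Patients with exactly one visit are dropped here.
    return {p: v for p, v in by_patient.items() if len(v) >= 2}


# Toy records (hypothetical): p2 has a single visit and should be removed.
toy_visits = [
    ("p1", "v1"), ("p1", "v2"),
    ("p2", "v3"),
    ("p3", "v4"), ("p3", "v5"), ("p3", "v6"),
]

kept = keep_multi_visit_patients(toy_visits)
n_patients = len(kept)
n_visits = sum(len(v) for v in kept.values())
print(n_patients, n_visits)  # 2 5
```

Running the same kind of count on the patients returned by a dataset is a quick way to see whether a gap comes from the filter or from the underlying data.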

ycq091044 commented 1 year ago

Thanks for your question. We will look into it.

ycq091044 commented 1 year ago

Hello, I think you should construct MIMIC3Dataset with "dev=False". Here are my screenshots:

[screenshot]

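
A minimal sketch of the suggested call, assuming the PyHealth 1.x API in use at the time of this thread. The wrapper name `load_drug_rec_samples`, the `root` value in the usage note, and the `tables` list are assumptions for illustration and are not taken from the screenshots.

```python
def load_drug_rec_samples(root, dev=False):
    """Load MIMIC-III with dev=False (full data) and apply the drug recommendation task."""
    # Imported lazily so this file can be read or imported without PyHealth installed.
    from pyhealth.datasets import MIMIC3Dataset
    from pyhealth.tasks import drug_recommendation_mimic3_fn

    mimic3_ds = MIMIC3Dataset(
        root=root,  # directory holding the MIMIC-III CSV files
        # Assumed table list: diagnoses, procedures, and prescriptions.
        tables=["DIAGNOSES_ICD", "PROCEDURES_ICD", "PRESCRIPTIONS"],
        dev=dev,  # dev=True would load only a small subset of patients
    )
    mimic3_ds.stat()  # print patient/visit statistics before the task is applied

    sample_ds = mimic3_ds.set_task(task_fn=drug_recommendation_mimic3_fn)
    sample_ds.stat()  # print sample statistics after the task is applied
    return sample_ds
```

It would be called as `load_drug_rec_samples("/path/to/mimiciii/1.4")`, where the path is a placeholder for a local copy of the data.
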
He-Yichen commented 1 year ago

@ycq091044 Hi, here are my screenshots. Why do I get different results from yours? Is it because the "root" parameter is different?

[screenshot]

ycq091044 commented 1 year ago

Ok, you are using the synthetic MIMIC-III data, which we generated ourselves and which is not the real MIMIC-III. We are not allowed to distribute the real MIMIC-III data, so you have to request it from https://physionet.org/content/mimiciii/1.4/
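
One rough way to catch this kind of mix-up is to check what `root` points at before comparing statistics. The sketch below assumes that the synthetic copy is loaded from a remote URL while the real data sits in a local directory; that is a heuristic, not something PyHealth enforces. The helper `root_is_remote` and both example roots are hypothetical.

```python
from urllib.parse import urlparse


def root_is_remote(root):
    """Return True if root looks like an http(s) URL rather than a local path."""
    return urlparse(root).scheme in ("http", "https")


# Placeholder roots for illustration only.
print(root_is_remote("https://example.org/Synthetic_MIMIC-III/"))  # True
print(root_is_remote("/data/physionet/mimiciii/1.4"))              # False
```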