ycq091044 / SafeDrug

IJCAI2021: Code for SafeDrug, MIMIC data processing, Medical code mapping

Question about data processing #27

Closed by He-Yichen 9 months ago

He-Yichen commented 9 months ago

Hi, the paper states that patients with only one visit are filtered out, but I found that 908 records in the data still contain only one visit. Could you help look into the problem?

ycq091044 commented 9 months ago

Hi @He-Yichen, thanks for your question. Could you tell me how you obtained records_final.pkl? If you used processing.py to generate this .pkl file, it should not contain any patient with #visit = 1. You can check line 477 in processing.py, which filters out those patients. In any case, you could also remove those patients manually.
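For the manual removal suggested above, a minimal sketch might look like the following. It assumes that records_final.pkl holds a list of patients, that each patient is a list of visits, and that the file can be read with the standard pickle module; none of this is confirmed in the thread. The helper names (`load_records`, `count_single_visit_patients`, `drop_single_visit_patients`) are hypothetical and are not part of processing.py.

```python
import pickle


def load_records(path):
    """Load the records file (assumption: readable with the standard pickle module)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def count_single_visit_patients(records):
    """Count patients whose record contains exactly one visit."""
    return sum(1 for patient in records if len(patient) == 1)


def drop_single_visit_patients(records):
    """Return a new list that keeps only patients with at least two visits."""
    return [patient for patient in records if len(patient) >= 2]


# Toy data in the assumed layout: patient -> list of visits,
# where each visit is a placeholder for its codes.
visit = [["diag"], ["proc"], ["med"]]
toy_records = [[visit, visit], [visit], [visit, visit, visit]]

n_single = count_single_visit_patients(toy_records)
filtered = drop_single_visit_patients(toy_records)
```

On a real file, one would call `load_records("records_final.pkl")` first and compare `n_single` against the 908 single-visit records reported above.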

He-Yichen commented 9 months ago

Hi @ycq091044, I've tried using the records_final.pkl from the master branch, and I also tried generating records_final.pkl with processing.py. Both give the same result, and I have been puzzled by this for quite some time. The current statistics of my records_final.pkl are shown below:

patients (6350,)

clinical events 15032

diagnosis 1958

med 112

procedure 1430

avg of diagnoses 10.5089143161256

avg of medicines 11.647751463544438

avg of procedures 3.8436668440659925

avg of vists 2.367244094488189

max of diagnoses 128

max of medicines 64

max of procedures 50

max of visit 29
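As a quick sanity check on these figures, 15032 clinical events over 6350 patients gives the reported average of about 2.37 visits. An average above 2 does not by itself rule out single-visit patients, so a per-patient histogram of visit counts is more informative. The sketch below assumes the same layout as before (a list of patients, each a list of visits); `visit_histogram` is a hypothetical helper and the toy data is made up for illustration.

```python
from collections import Counter


def visit_histogram(records):
    """Map number of visits -> number of patients with that many visits."""
    return Counter(len(patient) for patient in records)


# Figures copied from the statistics above.
n_patients = 6350
n_visits = 15032
mean_visits = n_visits / n_patients  # matches the reported average of visits

# Hypothetical toy data: the mean is above 2, yet one patient has a single visit.
visit = [["diag"], ["proc"], ["med"]]
toy_records = [[visit], [visit, visit], [visit] * 5]

toy_hist = visit_histogram(toy_records)
toy_mean = sum(len(p) for p in toy_records) / len(toy_records)
```

Running `visit_histogram` on the loaded records and reading the entry for key 1 would show directly how many single-visit patients remain.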