Closed xiangxingGuo closed 6 months ago
Sorry for the inconvenience. It has been more than a year since I finished graduate school, and I'm not sure whether the original 'data_split_fixed' files are still on the lab server there.
Running "python ehr_federated/preprocess.py --data_path [data_storage_path]" generates a set of {icustay_id}.pt files.
The "data_split_fixed" folder refers to these icustay_ids.
Inside this folder there should be one json file per hospital_id.
Each json file must list the train/valid/test icustay_ids for its hospital_id. I randomly split the ICU stays of each client into train/valid/test at a 7:1.5:1.5 ratio.
If you adapt the code below, you should be able to reproduce the split.
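from sklearn.model_selection import train_test_split
# data: the ICU stays of one client (presumably the list of icustay_ids for one hospital_id)
test_size = 0.15
val_size = 0.15
# First hold out 30% of the stays, then cut that 30% in half into valid and test.
train_data, test_val_data = train_test_split(data, test_size=(test_size + val_size), random_state=42)
val_data, test_data = train_test_split(test_val_data, test_size=(test_size / (test_size + val_size)), random_state=42)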
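For anyone rebuilding the folder from scratch, here is a rough sketch of how the per-hospital json files could be written. Several things in it are assumptions rather than the original code: the helper names split_ids and write_split_files, the stays_by_hospital mapping (hospital_id to its icustay_ids, which you would have to build yourself), and the json keys "train"/"valid"/"test". The "{hospital_id}_ver2.json" file name is taken from the path in the error message. It uses a seeded random shuffle as a dependency-free stand-in for train_test_split, so it will not reproduce the exact partition that the sklearn snippet above produces.

```python
import json
import random
from pathlib import Path


def split_ids(ids, val_size=0.15, test_size=0.15, seed=42):
    """Shuffle ids and cut them into train/valid/test (about 7:1.5:1.5)."""
    ids = sorted(ids)  # fixed starting order so the seed gives a stable result
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * test_size)
    n_val = round(len(ids) * val_size)
    n_train = len(ids) - n_val - n_test
    return {
        # Key names are an assumption; match them to what the loader reads.
        "train": ids[:n_train],
        "valid": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }


def write_split_files(stays_by_hospital, out_dir):
    """Write one {hospital_id}_ver2.json per hospital into out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for hospital_id, icustay_ids in stays_by_hospital.items():
        split = split_ids(icustay_ids)
        with open(out_dir / f"{hospital_id}_ver2.json", "w") as f:
            json.dump(split, f)
```

Calling write_split_files({73: [...]}, ".../data_split_fixed") with the real icustay_ids of hospital 73 would create 73_ver2.json in that folder. Please check the key names and id types against the loading code in ehr_federated.py before relying on it.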
Thank you.
https://github.com/wns823/medical_federated/blob/5b34d4457aa72d09b272c4dd35e1f4e055fd4202/ehr_federated/ehr_federated.py#L35
Hi there, I cannot find data_split_fixed after running preprocess.py, and I get this error:
FileNotFoundError: [Errno 2] No such file or directory: 'data_storage/eicu-2.0/federated_preprocessed_data/data_split_fixed/73_ver2.json'