NOTE: This branch has an issue, so I created a new branch and opened PR #39. Please review #39 instead.
Hi, I would like to change the `MacroAUC` metric to use multiprocessing, since computing it with a single thread takes too long when there are many labels (as in the MIMIC-III full dataset).
The number of processes used to compute macro AUC is set by the `num_process` parameter. If the parameter is not specified, it defaults to min(# CPU cores, 16). Setting it to 1 or less disables multiprocessing.
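For reviewers, here is a minimal sketch of the behavior described above. It is not the code in this PR: the names `resolve_num_process`, `binary_auc`, and `macro_auc` are hypothetical, the pure-Python rank-sum AUC is a stand-in for whatever the metric actually calls, and skipping labels that have only one class is an assumption.

```python
import os
from multiprocessing import Pool


def resolve_num_process(num_process=None, cap=16):
    """Return num_process, defaulting to min(# CPU cores, cap) when unset."""
    if num_process is None:
        return min(os.cpu_count() or 1, cap)
    return num_process


def binary_auc(pair):
    """ROC AUC for one label via the rank-sum (Mann-Whitney U) formula.

    `pair` is (y_true, y_score) with 0/1 integer targets.
    Returns None when only one class is present (AUC is undefined).
    """
    y_true, y_score = pair
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        return None
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    ranks = [0.0] * len(y_score)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and y_score[order[j + 1]] == y_score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank over the tie run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    pos_rank_sum = sum(r for r, t in zip(ranks, y_true) if t == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def macro_auc(y_true_cols, y_score_cols, num_process=None):
    """Average per-label AUC; each element of the inputs is one label's column."""
    pairs = list(zip(y_true_cols, y_score_cols))
    n = resolve_num_process(num_process)
    if n <= 1:
        # num_process <= 1 means no multiprocessing: plain loop in this process.
        aucs = [binary_auc(p) for p in pairs]
    else:
        # One task per label, distributed over n worker processes.
        with Pool(processes=n) as pool:
            aucs = pool.map(binary_auc, pairs)
    valid = [a for a in aucs if a is not None]  # assumption: skip undefined labels
    return sum(valid) / len(valid) if valid else float("nan")
```

Labels are independent of each other, so one task per label parallelizes without shared state. On platforms that start workers with `spawn` (Windows, macOS), a call with `num_process > 1` should run under an `if __name__ == "__main__":` guard.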
Here's the change in computation time on the MIMIC-III full dataset (the machine has 32 CPU cores, so `num_process` is 16):
Before changing to multiprocessing (2 min 15 sec)
2022-04-27 13:24:00,607 — src.trainers.base_trainer — INFO — Epoch: 54/200, Step 161028
Epoch 54: 100%|█████████████████████████████████████████████████████| 2982/2982 [35:26<00:00, 1.40it/s, Train Loss: 0.003833]
2022-04-27 13:59:50,691 — src.trainers.base_trainer — INFO — Evaluate on val dataset
2022-04-27 13:59:52,004 — src.trainers.base_trainer — INFO — prec_at_8: 0.718194
2022-04-27 13:59:53,312 — src.trainers.base_trainer — INFO — prec_at_15: 0.558022
2022-04-27 13:59:55,533 — src.trainers.base_trainer — INFO — macro_f1: 0.047854
**2022-04-27 13:59:57,781 — src.trainers.base_trainer — INFO — micro_f1: 0.515827
2022-04-27 14:02:12,639 — src.trainers.base_trainer — INFO — macro_auc: 0.897329**
2022-04-27 14:02:20,480 — src.trainers.base_trainer — INFO — micro_auc: 0.984774
2022-04-27 14:04:47,649 — src.trainers.base_trainer — INFO — Checkpoint saved to ckpt-54.pth
2022-04-27 14:04:49,153 — src.trainers.base_trainer — INFO — Checkpoint saved to best-54.pth (prec_at_8: 0.718194)
After changing to multiprocessing (3 sec)
2022-04-27 14:05:46,050 — src.trainers.base_trainer — INFO — Epoch: 55/200, Step 164010
Epoch 55: 100%|█████████████████████████████████████████████████████| 2982/2982 [38:32<00:00, 1.29it/s, Train Loss: 0.003814]
2022-04-27 14:44:51,776 — src.trainers.base_trainer — INFO — Evaluate on val dataset
2022-04-27 14:44:53,050 — src.trainers.base_trainer — INFO — prec_at_8: 0.718654
2022-04-27 14:44:54,352 — src.trainers.base_trainer — INFO — prec_at_15: 0.558512
2022-04-27 14:44:56,689 — src.trainers.base_trainer — INFO — macro_f1: 0.048206
**2022-04-27 14:44:58,913 — src.trainers.base_trainer — INFO — micro_f1: 0.516399
2022-04-27 14:45:01,965 — src.trainers.base_trainer — INFO — macro_auc: 0.897796**
2022-04-27 14:45:09,434 — src.trainers.base_trainer — INFO — micro_auc: 0.984819
2022-04-27 14:45:30,023 — src.trainers.base_trainer — INFO — Checkpoint saved to ckpt-55.pth
2022-04-27 14:45:31,691 — src.trainers.base_trainer — INFO — Checkpoint saved to best-55.pth (prec_at_8: 0.718654)