I have now refactored the CAML model. Some utility methods still remain in src/utils/caml_utils.py.
The current performance of CNN, CAML, and DR-CAML on the MIMIC-III top-50 dataset is shown below. The results look far better than those reported in the original CAML paper; I suspect this is because this dataset contains many more examples (and we also use a different set of top-50 codes).
**Vanilla CNN**

```
Checkpoint loaded from best-1.pth
Evaluate on test dataset
prec_at_5: 0.648547
prec_at_8: 0.527984
macro_f1: 0.635654
micro_f1: 0.689384
macro_auc: 0.913144
micro_auc: 0.936200
Save result on results/CNN_mimic3_50/test_result.json
```

**CAML**

```
Checkpoint loaded from best-22.pth
Evaluate on test dataset
prec_at_5: 0.651824
prec_at_8: 0.533704
macro_f1: 0.615738
micro_f1: 0.667918
macro_auc: 0.914351
micro_auc: 0.940175
Save result on results/CAML_mimic3_50/test_result.json
```

**DR-CAML**

```
Checkpoint loaded from best-23.pth
Evaluate on test dataset
prec_at_5: 0.651144
prec_at_8: 0.532854
macro_f1: 0.628317
micro_f1: 0.672664
macro_auc: 0.914375
micro_auc: 0.940238
Save result on results/DRCAML_mimic3_50/test_result.json
```
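For reference, here is a minimal sketch of how metrics like `prec_at_5` and `macro_f1` above can be computed from per-code probabilities. This is not the repository's evaluation code; the names `precision_at_k`, `evaluate`, `y_true`, and `y_prob` are assumptions for illustration, with `y_true` a binary label matrix of shape `(num_examples, 50)` and `y_prob` the predicted probabilities of the same shape.

```python
# Sketch only: illustrates the reported metrics, not the repo's actual evaluation code.
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score

def precision_at_k(y_true, y_prob, k):
    """Mean fraction of correct codes among the top-k scored codes per example."""
    topk = np.argsort(y_prob, axis=1)[:, -k:]                # indices of the k highest scores
    hits = np.take_along_axis(y_true, topk, axis=1).sum(1)   # correct codes among the top k
    return float(np.mean(hits / k))

def evaluate(y_true, y_prob, threshold=0.5):
    """Compute the same metric set that appears in the logs above."""
    y_pred = (y_prob >= threshold).astype(int)               # hard predictions for the F1 scores
    return {
        "prec_at_5": precision_at_k(y_true, y_prob, 5),
        "prec_at_8": precision_at_k(y_true, y_prob, 8),
        "macro_f1": f1_score(y_true, y_pred, average="macro", zero_division=0),
        "micro_f1": f1_score(y_true, y_pred, average="micro"),
        "macro_auc": roc_auc_score(y_true, y_prob, average="macro"),
        "micro_auc": roc_auc_score(y_true, y_prob, average="micro"),
    }
```

Note that the F1 scores depend on the 0.5 probability threshold, while `prec_at_5`, `prec_at_8`, and the AUCs are ranking-based and threshold-free.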