dalgu90 / icd-coding-benchmark

Automatic ICD coding benchmark based on the MIMIC dataset

Implement Fusion model #40

Closed dalgu90 closed 2 years ago

dalgu90 commented 2 years ago

Hello guys.

I ported the Fusion model. You can find the paper's hyper-parameters in the config files (`configs/fusion`).

Paper: https://aclanthology.org/2021.findings-acl.184/
GitHub repo: https://github.com/machinelearning4health/Fusion
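If you want to eyeball the hyper-parameters against the paper, something along these lines works. A minimal sketch, assuming the configs are YAML; the filename below is hypothetical, so use whichever file actually exists under `configs/fusion`:

```python
# Minimal sketch: dump a Fusion config to compare against the paper.
# The filename is hypothetical, and the actual schema may differ.
import yaml

with open("configs/fusion/fusion_mimic3_50.yml") as f:
    config = yaml.safe_load(f)

# Walk the top-level sections (e.g. dataset/model/training) and print them.
for section, params in config.items():
    print(f"[{section}]")
    print(params)
```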

NOTE: In the paper, the MIMIC-III top-50 result (its test split matches our top-50 (old) dataset) was obtained by training on the full dataset (roughly 5x the training examples). That setting is equivalent to our new top-50 dataset, so we compare against it below. The authors have since shared results on the original top-50 dataset in their repo, which we compare under MIMIC-III top-50 (old).

**MIMIC-III top-50 (old)**

| Code | Macro AUC | Micro AUC | Macro F1 | Micro F1 | P@5 | Note |
|------|-----------|-----------|----------|----------|-----|------|
| Author | 0.894 | 0.924 | 0.598 | 0.659 | 0.635 | Updated in the repo |
| Ours | 0.904610 | 0.929229 | 0.611743 | 0.674127 | 0.640023 | |

**MIMIC-III top-50**

| Code | Macro AUC | Micro AUC | Macro F1 | Micro F1 | P@5 | Note |
|------|-----------|-----------|----------|----------|-----|------|
| Author | 0.931 | 0.950 | 0.683 | 0.725 | 0.679 | Top-50 in the paper (or enhanced 50 settings in the repo) |
| Ours | 0.932295 | 0.952335 | 0.660386 | 0.726407 | 0.678726 | |

**MIMIC-III full**

| Code | Macro AUC | Micro AUC | Macro F1 | Micro F1 | P@8 | Note |
|------|-----------|-----------|----------|----------|-----|------|
| Author | 0.915 | 0.987 | 0.083 | 0.554 | 0.736 | In the paper |
| Ours | 0.907964 | 0.986258 | 0.079416 | 0.559838 | 0.747628 | |

**MIMIC-III full (old)**

| Code | Macro AUC | Micro AUC | Macro F1 | Micro F1 | P@8 | Note |
|------|-----------|-----------|----------|----------|-----|------|
| Ours | 0.912643 | 0.986653 | 0.078532 | 0.556019 | 0.743105 | |
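For reference, the metrics in the tables can be computed from a binary label matrix and a probability matrix along these lines. This is a self-contained sketch with toy data, not the benchmark's own evaluator; the 0.5 threshold for F1 and the exact averaging are assumptions:

```python
# Sketch of the reported metrics (macro/micro AUC, macro/micro F1, P@k),
# given a binary label matrix y_true and per-label probabilities y_prob.
# NOT the benchmark's evaluator; toy data and 0.5 F1 threshold assumed.
import numpy as np
from sklearn.metrics import roc_auc_score, f1_score

def precision_at_k(y_true: np.ndarray, y_prob: np.ndarray, k: int) -> float:
    """Mean fraction of true labels among each sample's k highest-scored codes."""
    topk = np.argsort(y_prob, axis=1)[:, -k:]        # indices of the k largest scores
    hits = np.take_along_axis(y_true, topk, axis=1)  # 1 where a top-k code is correct
    return float(hits.mean())

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(64, 50))  # toy labels: 64 notes, 50 ICD codes
y_prob = rng.random((64, 50))               # toy predicted probabilities
y_pred = (y_prob >= 0.5).astype(int)        # hard predictions for F1

print("Macro AUC:", roc_auc_score(y_true, y_prob, average="macro"))
print("Micro AUC:", roc_auc_score(y_true, y_prob, average="micro"))
print("Macro F1 :", f1_score(y_true, y_pred, average="macro"))
print("Micro F1 :", f1_score(y_true, y_pred, average="micro"))
print("P@5      :", precision_at_k(y_true, y_prob, k=5))
```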

Here's the raw output of our implementation:

abheesht17 commented 2 years ago

@dalgu90, I forgot to mention one thing: please add the results for Fusion to the README in this PR itself.

dalgu90 commented 2 years ago

Thanks @abheesht17 for the detailed review. Let me fix them in a later commit (after our next meeting).

abheesht17 commented 2 years ago

> Thanks @abheesht17 for the detailed review. Let me fix them in a later commit (after our next meeting).

Sure! :)

dalgu90 commented 2 years ago

Hey team. I've addressed the comments, except the ones about fixing the class/layer names.