lrsoenksen / HAIM

This repository contains the code to replicate the data processing, modeling and reporting of our Holistic AI in Medicine (HAIM) Publication in Nature Machine Intelligence (Soenksen LR, Ma Y, Zeng C et al. 2022).
Apache License 2.0
104 stars 27 forks

Experiments results #7

Closed malika996 closed 1 year ago

malika996 commented 1 year ago

Hello! I am trying to reproduce some of the results from your paper. In particular, I would like to produce a plot like the one below to find out which combination of modalities justifies a multimodal approach over a vision-only one.

[Image: haim_roc]

For example, for fracture, the smallest dataset, I was able to get a 5-fold cross-validation test average macro AUROC of about 0.78 for the unimodal model (fusing per-image and multi-image dense visual embeddings). But when I add new (and less informative) modalities to it, the results stay almost the same (slightly better in some cases, slightly worse in others), perhaps because XGBoost handles the curse of dimensionality well. Since the number of combinations of input modalities is high (1023), I only tested a subset, but could not get close to 0.84 in average macro AUROC.

Could you please share the supporting information for the plot above, such as which combination of modalities is considered typical?

I also have a question about the number of experiments performed in the article. I understand how you got 1023 as the number of possible models for the pathology diagnosis tasks: 1023 = (number of models with 1 modality) + (number of models with 2 modalities) + (number of models with 3 modalities) + (number of models with 4 modalities).

Here the number of models with 1 modality is calculated from the number of combinations of the corresponding sources:
Tabular: 1
Time series: C(3,1) + C(3,2) + C(3,3) = 3 + 3 + 1 = 7
Notes (excluding radiology): C(2,1) + C(2,2) = 2 + 1 = 3
Visual: C(4,1) + C(4,2) + C(4,3) + C(4,4) = 4 + 6 + 4 + 1 = 15

Total: 26

And so on, up to 4 modalities. I also get a total of 1023 experiments.
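The counting described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming the per-modality source counts listed in this comment (1, 3, 2 and 4); the names `SOURCES`, `nonempty_subsets` and `models_by_modality_count` are made up for the example and do not come from the HAIM code.

```python
from itertools import combinations
from math import comb, prod

# Number of sources per modality, as listed in the comment above.
# Assumption: "notes" excludes radiology notes, as for the diagnosis tasks.
SOURCES = {"tabular": 1, "time_series": 3, "notes": 2, "visual": 4}


def nonempty_subsets(n):
    """Number of non-empty subsets of n sources: C(n,1) + ... + C(n,n)."""
    return sum(comb(n, k) for k in range(1, n + 1))


def models_by_modality_count(sources):
    """Map k -> number of models that use exactly k modalities."""
    per_modality = {m: nonempty_subsets(n) for m, n in sources.items()}
    return {
        k: sum(
            # For a fixed set of k modalities, multiply the number of
            # source choices available within each of them.
            prod(per_modality[m] for m in chosen)
            for chosen in combinations(per_modality, k)
        )
        for k in range(1, len(per_modality) + 1)
    }


counts = models_by_modality_count(SOURCES)
print(counts)                # {1: 26, 2: 196, 3: 486, 4: 315}
print(sum(counts.values()))  # 1023
```

The total equals (1+1)(7+1)(3+1)(15+1) - 1 = 2^10 - 1, which is simply the number of non-empty subsets of the 10 sources.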

However, I don't get the same number of experiments for the 48-hour length-of-stay and mortality prediction tasks, where the difference is that radiology notes are included. Could you please explain how you get 2047 (2046)?

Thank you!

lrsoenksen commented 1 year ago

Hi @malika996,

All supporting information can be accessed via the journal publication (https://static-content.springer.com/esm/art%3A10.1038%2Fs41746-022-00689-4/MediaObjects/41746_2022_689_MOESM1_ESM.pdf). Regarding the number of experiments: there are 4 modalities, but there are actually 11 sources (as seen in Supplemental Table 2). If you calculate the combinations over the 11 sources (differentiating by modality when forming the combinations), you will see how we get the number of models required in our study. You can also see details on every model in the supplementals (page 7 of the supplemental PDF, Supplemental Fig. 1). Regarding the plots: all the code used to extract the data for the plots is included in the code we provide. However, the actual plots for publication were drawn in GraphPad Prism.
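As a quick check of the 11-source count, the non-empty subsets can be enumerated directly. This is a sketch under stated assumptions, not the enumeration used in the study: the per-modality split is taken from the question above with radiology notes added as a third notes source, and the source labels are placeholders. Supplemental Table 2 remains the reference for the actual sources.

```python
from itertools import combinations

# Assumption: source counts per modality from the question above, with
# radiology notes added as a third "notes" source (11 sources in total).
SOURCES_PER_MODALITY = {"tabular": 1, "time_series": 3, "notes": 3, "visual": 4}

# Placeholder labels such as "notes_0"; these are not names from the HAIM code.
sources = [
    f"{modality}_{i}"
    for modality, n in SOURCES_PER_MODALITY.items()
    for i in range(n)
]

# Every non-empty subset of sources is one candidate model.
subsets = [
    subset
    for k in range(1, len(sources) + 1)
    for subset in combinations(sources, k)
]

print(len(sources))  # 11
print(len(subsets))  # 2047
```

With 11 sources this gives 2^11 - 1 = 2047 non-empty combinations. The sketch does not account for the 2046 variant mentioned in the question.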