About Fig5 in paper

Hi @WangMengxiao319 , I'm very sorry for my late reply!

For demonstration, we carried out such a comparable analysis for IVCD in order to understand the comparably weak classification performance on the particular statement compared to other conduction disturbances. Indeed, clustering the model’s output probabilities with k-means clustering revealed two clusters, where one cluster performed much better than the other as can be seen in Fig. 5. Interestingly, it turned out that the two clusters largely align with the presence/absence of NORM as additional ECG statement.

As already stated in the paper, we selected the samples that have IVCD in their label set (remember that the labels are multi-label). For these samples, we computed the output probabilities (piece-wise sigmoids) and clustered them with K-Means. Each cluster was then evaluated separately, which revealed clusters with higher and lower errors that correlate with the co-occurrence of NORM. It's really as simple as that, no further magic here.
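In case it helps, here is a rough sketch of those steps on synthetic data. It is not the code used for the paper and does not reproduce Fig. 5. The class list, the synthetic probabilities, the error measure (`1 - p(IVCD)` on IVCD-positive samples) and all function names are assumptions made for illustration. The small pure-Python k-means (Lloyd's algorithm with a deterministic farthest-first initialisation) only stands in for a library implementation such as scikit-learn's `KMeans`, so that the snippet runs without dependencies.

```python
CLASSES = ["IVCD", "NORM", "CLBBB"]  # hypothetical subset of ECG statements
IVCD, NORM = CLASSES.index("IVCD"), CLASSES.index("NORM")


def make_synthetic(n_per_group=20, seed=0):
    """Build (multi-hot labels, output probabilities) pairs.

    Purely synthetic numbers, NOT real PTB-XL data or model outputs.
    """
    import random
    rng = random.Random(seed)

    def noisy(mean):
        # Gaussian noise around `mean`, clamped to a valid probability.
        return min(1.0, max(0.0, rng.gauss(mean, 0.05)))

    data = []
    for _ in range(n_per_group):
        # IVCD without NORM: the model is confident about IVCD.
        data.append(([1, 0, 0], [noisy(0.8), noisy(0.1), noisy(0.1)]))
        # IVCD together with NORM: the model assigns IVCD a low probability.
        data.append(([1, 1, 0], [noisy(0.3), noisy(0.8), noisy(0.1)]))
        # No IVCD: removed by the selection step below.
        data.append(([0, 0, 1], [noisy(0.1), noisy(0.1), noisy(0.9)]))
    return data


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=2, max_iter=100):
    """Lloyd's algorithm; returns one cluster index per point."""
    # Farthest-first initialisation keeps the sketch deterministic.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    assign = None
    for _ in range(max_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        new_assign = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_assign == assign:
            break  # converged
        assign = new_assign
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def cluster_report(data, k=2):
    """Select IVCD samples, cluster their probabilities, evaluate per cluster."""
    selected = [(y, p) for y, p in data if y[IVCD] == 1]
    assign = kmeans([p for _, p in selected], k)
    report = []
    for c in range(k):
        members = [yp for yp, a in zip(selected, assign) if a == c]
        if not members:
            continue
        report.append({
            "size": len(members),
            # Assumed error measure: how far p(IVCD) is from the true label 1.
            "mean_ivcd_error": sum(1 - p[IVCD] for _, p in members) / len(members),
            # Fraction of cluster members that also carry the NORM statement.
            "norm_fraction": sum(y[NORM] for y, _ in members) / len(members),
        })
    return report


if __name__ == "__main__":
    for index, row in enumerate(cluster_report(make_synthetic())):
        print(index, row)
```

With real data you would replace `make_synthetic()` with your own label matrix and model outputs, and swap the error measure for whatever metric you want to compare between clusters.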

helme / ecg_ptbxl_benchmarking

About Fig5 in paper #29