frederikkemarin / BEND

Benchmarking DNA Language Models on Biologically Meaningful Tasks
BSD 3-Clause "New" or "Revised" License

Clarification on MCC Calculation Method #56

Closed: yangzhao1230 closed this issue 6 months ago

yangzhao1230 commented 6 months ago

I noticed that in the code, the MCC is calculated and then combined with the per-class precision and recall values into a single list. Could you please explain the rationale behind this approach?

  if self.config.params.metric == 'mcc':
      metric = matthews_corrcoef(y_true.numpy().ravel(), y_pred.numpy().ravel())
      recall = recall_score(y_true.numpy().ravel(), y_pred.numpy().ravel(), average=None).tolist()
      precision = precision_score(y_true.numpy().ravel(), y_pred.numpy().ravel(), average=None).tolist()
      #tp = confusion_matrix(y_true.numpy().ravel(), y_pred.numpy().ravel(), normalize='true').diagonal().tolist()
      metric = [metric] + recall + precision #[list(i) for i in zip(recall, precision)]
fteufel commented 6 months ago

Hi, we report the single-class metrics in addition to the MCC, and we return them all together as a list. The MCC itself is not affected by this. We show the additional metrics in the supplement.
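A minimal sketch of how that combined list is laid out and how the MCC can be read back out of it (toy labels below, not the actual BEND data or training code):

    import numpy as np
    from sklearn.metrics import matthews_corrcoef, precision_score, recall_score

    # toy 3-class labels standing in for y_true / y_pred
    y_true = np.array([0, 1, 2, 2, 1, 0, 2, 1])
    y_pred = np.array([0, 1, 2, 1, 1, 0, 2, 2])

    mcc = matthews_corrcoef(y_true.ravel(), y_pred.ravel())
    recall = recall_score(y_true.ravel(), y_pred.ravel(), average=None).tolist()
    precision = precision_score(y_true.ravel(), y_pred.ravel(), average=None).tolist()

    # layout of the returned list: [MCC, recall per class..., precision per class...]
    combined = [mcc] + recall + precision
    reported_mcc = combined[0]  # the MCC value itself is untouched by the concatenation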

yangzhao1230 commented 6 months ago

Thank you for your quick response.

I checked the code again and found that you average all the metrics only during the validation stage: val_metric = np.mean(val_metric).

I initially thought you were reporting this average metric as the result in the paper.
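To make the distinction concrete, a small sketch of the behaviour described above (assumed illustrative numbers, not actual BEND results): averaging the combined list for validation yields a different scalar than the MCC that is reported.

    import numpy as np

    # hypothetical combined list: [MCC, per-class recalls..., per-class precisions...]
    combined = [0.72, 0.80, 0.65, 0.90, 0.78, 0.70, 0.85]

    val_metric = np.mean(combined)  # scalar used during validation, ~0.771 here
    paper_mcc = combined[0]         # the MCC itself, 0.72, which the paper reports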