adriancorrendo / metrica

Prediction Performance Metrics
https://adriancorrendo.github.io/metrica/

Garbage results if some classes are missing in predictions #41

Open jibaer-izw opened 1 month ago

jibaer-izw commented 1 month ago

If the number of different classes in truth and prediction is not equal, the score functions produce garbage because the confusion matrix is not square and diag(matrix) gives wrong results.

In my view, all functions must be rewritten, or a sanity check for a square confusion matrix must be introduced. In its current state the package sometimes produces weird results like Recall = 6.457.
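
For illustration, here is a minimal base-R reproduction of the shape problem (this is not the package's internal code, just the pattern that breaks when a class is missing from the predictions):

# "b" never shows up in the predictions, so table() returns a 3 x 2
# (non-square) matrix and diag() silently pairs row "b" with column "c"
y_true <- c("a", "b", "c", "a", "b", "c")
y_pred <- c("a", "a", "c", "a", "c", "c")

cm <- table(y_true, y_pred)
dim(cm)   # 3 2  -> not square
diag(cm)  # 2 1  -> the second value counts y_true == "b" & y_pred == "c"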

adriancorrendo commented 1 month ago

Hi @jibaer-izw thanks for bringing this issue up.

First comment is yes, "garbage in" = "garbage out". The package is supposed to work only with clean predicted and observed data. Data quality checks are beyond the scope of the package, and it is the exclusive responsibility of the user to explore the data and ensure good data quality.

The easiest way to solve this would be to pass a drop_na() / na.omit = TRUE.

That being said, I could potentially add an error message if the data has missing values, but again, the user can easily sort this issue out with a quick and easy check of the data before running any function of the package.
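
For example, something along these lines on the user side (just a sketch; the data frame and column names are placeholders):

# drop incomplete rows before computing any score
df <- data.frame(obs  = c("a", "b", NA,  "c"),
                 pred = c("a", "b", "b", NA))

clean <- na.omit(df)          # base R
clean <- tidyr::drop_na(df)   # tidyverse equivalent, same result here

# then pass `clean` to whichever metrica score function you are using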

About the non-square confusion matrix, are you willing to try writing code to implement that as a potential way out for missing data? If you do, I will of course add you as a contributor to the package.

Cheers, Adrian

jibaer-izw commented 1 month ago

Hi Adrian, it is not uncommon in machine learning that one or more classes do not occur in the predictions. In these cases the entries on the diagonal of the confusion matrix are no longer the y_pred == y_true counts and the whole method used in your package fails.

I am no R expert (Python & scikit-learn is much better for ML), but I have programmed an ad hoc solution for the scores that works with missing classes in the predictions:

# build a long-format confusion table (y_true, y_pred, Freq) from the raw labels
ConfusionDF <- function(y_true, y_pred) {
  DF <- transform(as.data.frame(table(y_true, y_pred)),
                  y_true = as.character(y_true),
                  y_pred = as.character(y_pred),
                  Freq = as.integer(Freq))
  return(DF)
}

# sum the cells of the confusion table and return TP, FP, FN for each class
# in y_true as a list of named numeric vectors
calcScoreSums <- function(y_true, y_pred) {

  df <- ConfusionDF(y_true, y_pred)

  classes <- sort(unique(y_true))

  TP <- numeric(length(classes))
  FP <- numeric(length(classes))
  FN <- numeric(length(classes))

  i <- 1
  for (class in classes) {
    # a class absent from y_pred simply contributes 0 to TP and FP here
    TP[i] <- sum(df[df$y_true == class & df$y_pred == class, ]$Freq)
    FP[i] <- sum(df[df$y_true != class & df$y_pred == class, ]$Freq)
    FN[i] <- sum(df[df$y_true == class & df$y_pred != class, ]$Freq)
    i <- i + 1
  }
  names(TP) <- classes
  names(FP) <- classes
  names(FN) <- classes

  return(list(TP = TP, FP = FP, FN = FN))
}

Calculating Precision, Recall and F1-score from TP, FP, FN is then simple ...
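
Just as a sketch on top of calcScoreSums() (calcClassScores is a made-up name; divisions by zero come out as NaN, which you may want to map to 0):

# per-class Precision, Recall and F1 from the TP/FP/FN sums above
calcClassScores <- function(y_true, y_pred) {
  s <- calcScoreSums(y_true, y_pred)
  precision <- s$TP / (s$TP + s$FP)
  recall    <- s$TP / (s$TP + s$FN)
  f1        <- 2 * precision * recall / (precision + recall)
  return(list(precision = precision, recall = recall, f1 = f1))
}

# class "b" never appears in the predictions, yet every score stays
# in [0, 1] (or NaN where undefined) instead of blowing up
y_true <- c("a", "b", "c", "a", "b", "c")
y_pred <- c("a", "a", "c", "a", "c", "c")
calcClassScores(y_true, y_pred)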

Feel free to use this method in your package.

Cheers JIBaer