signaturescience / skater

SKATE R Utilities
https://signaturescience.github.io/skater/

Confusion matrix #25

Closed stephenturner closed 3 years ago

stephenturner commented 3 years ago

caret has a nice confusion matrix function (source code). It prints lots of useful stats for each class, as well as overall stats. Example below.

The caret package has a ton of dependencies, and the confusionMatrix() function itself requires several of the additional dependencies in Suggests. It can be finicky to install on any OS, and I'd rather not include it as a dependency.
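
For context, a quick way to put a number on that footprint (a sketch using base R's tools package; needs a CRAN mirror configured):

db <- utils::available.packages()
# all recursive Depends/Imports/LinkingTo dependencies of caret
deps <- tools::package_dependencies("caret", db = db, recursive = TRUE)[["caret"]]
length(deps)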

Can we pull out the useful bits from the source code, or else find another lighter weight function elsewhere we can copy over that could give us similar stats/functionality?

> caret::confusionMatrix(data=result$degree.est, reference=result$degree.sim)
Confusion Matrix and Statistics

           Reference
Prediction    0   1   2   3 unrelated
  0           0   2   0   0         0
  1           0  40  10   1         0
  2           0   3  10   2         1
  3           0   0   3   1        53
  unrelated   0   0   7   1       646

Overall Statistics

               Accuracy : 0.8936
                 95% CI : (0.8698, 0.9144)
    No Information Rate : 0.8974
    P-Value [Acc > NIR] : 0.6649
                  Kappa : 0.5612
 Mcnemar's Test P-Value : NA

Statistics by Class:

                     Class: 0 Class: 1 Class: 2 Class: 3 Class: unrelated
Sensitivity                NA  0.88889  0.33333 0.200000           0.9229
Specificity          0.997436  0.98503  0.99200 0.927742           0.9000
Pos Pred Value             NA  0.78431  0.62500 0.017544           0.9878
Neg Pred Value             NA  0.99314  0.97382 0.994467           0.5714
Prevalence           0.000000  0.05769  0.03846 0.006410           0.8974
Detection Rate       0.000000  0.05128  0.01282 0.001282           0.8282
Detection Prevalence 0.002564  0.06538  0.02051 0.073077           0.8385
Balanced Accuracy          NA  0.93696  0.66267 0.563871           0.9114
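
Roughly, the useful bits boil down to a handful of tabulations. An untested base-R sketch of the core pieces (confusion_stats is just a placeholder name; assumes pred and ref are factors sharing the same levels):

confusion_stats <- function(pred, ref) {
  tab <- table(Predicted = pred, Reference = ref)
  n <- sum(tab)
  acc <- sum(diag(tab)) / n
  # one-vs-all sensitivity/specificity for each class
  by_class <- t(sapply(rownames(tab), function(cl) {
    tp <- tab[cl, cl]
    fn <- sum(tab[, cl]) - tp  # truly cl, predicted otherwise
    fp <- sum(tab[cl, ]) - tp  # predicted cl, truly otherwise
    tn <- n - tp - fn - fp
    c(Sensitivity = tp / (tp + fn), Specificity = tn / (tn + fp))
  }))
  list(Table = tab, Accuracy = acc, ByClass = by_class)
}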

cc @vpnagraj

vpnagraj commented 3 years ago

another option besides caret:

https://github.com/m-clark/confusionMatrix/

usage:

devtools::install_github('m-clark/confusionMatrix')

library(confusionMatrix)
library(tidyverse)

pred <-
  c(rep("first", 48),
    rep("second", 55),
    rep("third", 41),
    rep("unrelated", 56))

truth <-
  c(rep("first",50),
    rep("second",50),
    rep("third",50),
    rep("unrelated",50))

confusion_matrix(
  prediction = pred,
  target = truth,
  return_table = TRUE
) %>%
  purrr::pluck("Other") %>%
  gather(Measure, Value, -Class,-N) %>%
  select(-N) %>%
  spread(Class, Value) %>%
  knitr::kable()
|Measure                |   Average|     first|    second|     third| unrelated|
|:----------------------|---------:|---------:|---------:|---------:|---------:|
|AUC                    | 0.9999780| 0.9999968| 0.9999978| 0.9999205| 0.9999969|
|Balanced Accuracy      | 0.9633333| 0.9800000| 0.9833333| 0.9100000| 0.9800000|
|D Prime                | 6.3160772| 6.5040988| 6.5873255| 5.6687856| 6.5040988|
|Detection Prevalence   | 0.2500000| 0.2400000| 0.2750000| 0.2050000| 0.2800000|
|Detection Rate         | 0.2362500| 0.2400000| 0.2500000| 0.2050000| 0.2500000|
|F1/Dice                | 0.9441170| 0.9795918| 0.9523810| 0.9010989| 0.9433962|
|FDR                    | 0.0495130| 0.0000000| 0.0909091| 0.0000000| 0.1071429|
|FNR                    | 0.0550000| 0.0400000| 0.0000000| 0.1800000| 0.0000000|
|FOR                    | 0.0174404| 0.0131579| 0.0000000| 0.0566038| 0.0000000|
|FPR/Fallout            | 0.0183333| 0.0000000| 0.0333333| 0.0000000| 0.0400000|
|NPV                    | 0.9825596| 0.9868421| 1.0000000| 0.9433962| 1.0000000|
|PPV/Precision          | 0.9504870| 1.0000000| 0.9090909| 1.0000000| 0.8928571|
|Prevalence             | 0.2500000| 0.2500000| 0.2500000| 0.2500000| 0.2500000|
|Sensitivity/Recall/TPR | 0.9450000| 0.9600000| 1.0000000| 0.8200000| 1.0000000|
|Specificity/TNR        | 0.9816667| 1.0000000| 0.9666667| 1.0000000| 0.9600000|
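
side note: gather() and spread() are superseded in recent tidyr; the same reshape with the pivot verbs would look something like this (assuming the pred and truth vectors above):

confusion_matrix(
  prediction = pred,
  target = truth,
  return_table = TRUE
) %>%
  purrr::pluck("Other") %>%
  tidyr::pivot_longer(-c(Class, N), names_to = "Measure", values_to = "Value") %>%
  dplyr::select(-N) %>%
  tidyr::pivot_wider(names_from = Class, values_from = Value) %>%
  knitr::kable()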

pros:

cons:

@stephenturner what do you think?

stephenturner commented 3 years ago

Nice find. Looking through the code, there isn't a whole lot of complex stuff going on here. I might just copy all the functions into a single .R file in skater.

stephenturner commented 3 years ago

I still have some work to do here, but I've taken the code from m-clark/confusionMatrix, cut out the stuff we don't want (the agreement calculations and the stats that throw errors), credited m-clark, and added a contingency table to the returned output. Demonstration below: pulling out the contingency table, then outputting in long format and re-widening. PR in #27

cc @vpnagraj @genignored

library(skater)

prediction = c(rep(1, 50), rep(2, 40), rep(3, 60))
target     = c(rep(1, 50), rep(2, 50), rep(3, 50))

confusion_matrix(prediction, target)
#> $Accuracy
#> # A tibble: 1 x 5
#>   Accuracy `Accuracy LL` `Accuracy UL` `Accuracy Guessing` `Accuracy P-value`
#>      <dbl>         <dbl>         <dbl>               <dbl>              <dbl>
#> 1    0.933         0.881         0.968               0.333           3.36e-54
#> 
#> $Other
#> # A tibble: 4 x 15
#>   Class     N `Sensitivity/Re… `Specificity/TN… `PPV/Precision`   NPV `F1/Dice`
#>   <chr> <dbl>            <dbl>            <dbl>           <dbl> <dbl>     <dbl>
#> 1 1        50            1                1               1     1         1    
#> 2 2        50            0.8              1               1     0.909     0.889
#> 3 3        50            1                0.9             0.833 1         0.909
#> 4 Aver…    50            0.933            0.967           0.944 0.970     0.933
#> # … with 8 more variables: Prevalence <dbl>, `Detection Rate` <dbl>, `Detection
#> #   Prevalence` <dbl>, `Balanced Accuracy` <dbl>, FDR <dbl>, FOR <dbl>,
#> #   `FPR/Fallout` <dbl>, FNR <dbl>
#> 
#> $Table
#>          Target
#> Predicted  1  2  3
#>         1 50  0  0
#>         2  0 40  0
#>         3  0 10 50

confusion_matrix(prediction, target) %>% purrr::pluck("Table")
#>          Target
#> Predicted  1  2  3
#>         1 50  0  0
#>         2  0 40  0
#>         3  0 10 50
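
For what it's worth, that $Table element matches what base table() produces on the same vectors:

table(Predicted = prediction, Target = target)
#>          Target
#> Predicted  1  2  3
#>         1 50  0  0
#>         2  0 40  0
#>         3  0 10 50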

confusion_matrix(prediction, target, longer=TRUE)
#> $Accuracy
#> # A tibble: 5 x 2
#>   Statistic            Value
#>   <chr>                <dbl>
#> 1 Accuracy          9.33e- 1
#> 2 Accuracy LL       8.81e- 1
#> 3 Accuracy UL       9.68e- 1
#> 4 Accuracy Guessing 3.33e- 1
#> 5 Accuracy P-value  3.36e-54
#> 
#> $Other
#> # A tibble: 56 x 3
#>    Class Statistic               Value
#>    <chr> <chr>                   <dbl>
#>  1 1     N                      50    
#>  2 1     Sensitivity/Recall/TPR  1    
#>  3 1     Specificity/TNR         1    
#>  4 1     PPV/Precision           1    
#>  5 1     NPV                     1    
#>  6 1     F1/Dice                 1    
#>  7 1     Prevalence              0.333
#>  8 1     Detection Rate          0.333
#>  9 1     Detection Prevalence    0.333
#> 10 1     Balanced Accuracy       1    
#> # … with 46 more rows
#> 
#> $Table
#>          Target
#> Predicted  1  2  3
#>         1 50  0  0
#>         2  0 40  0
#>         3  0 10 50

confusion_matrix(prediction, target, longer=TRUE) %>%
  purrr::pluck("Other") %>%
  tidyr::spread(Class, Value)
#> # A tibble: 14 x 5
#>    Statistic                 `1`     `2`    `3` Average
#>    <chr>                   <dbl>   <dbl>  <dbl>   <dbl>
#>  1 Balanced Accuracy       1      0.9     0.95   0.95  
#>  2 Detection Prevalence    0.333  0.267   0.4    0.333 
#>  3 Detection Rate          0.333  0.267   0.333  0.311 
#>  4 F1/Dice                 1      0.889   0.909  0.933 
#>  5 FDR                     0      0       0.167  0.0556
#>  6 FNR                     0      0.200   0      0.0667
#>  7 FOR                     0      0.0909  0      0.0303
#>  8 FPR/Fallout             0      0       0.100  0.0333
#>  9 N                      50     50      50     50     
#> 10 NPV                     1      0.909   1      0.970 
#> 11 PPV/Precision           1      1       0.833  0.944 
#> 12 Prevalence              0.333  0.333   0.333  0.333 
#> 13 Sensitivity/Recall/TPR  1      0.8     1      0.933 
#> 14 Specificity/TNR         1      1       0.9    0.967
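
One nice thing about longer=TRUE: the long format drops straight into ggplot2 for a quick per-class comparison. An illustrative sketch (not part of the PR), reusing the prediction/target vectors above:

library(ggplot2)

confusion_matrix(prediction, target, longer = TRUE) %>%
  purrr::pluck("Other") %>%
  # keep a few headline per-class statistics
  dplyr::filter(Statistic %in% c("Sensitivity/Recall/TPR", "Specificity/TNR", "PPV/Precision")) %>%
  ggplot(aes(Class, Value)) +
  geom_col() +
  facet_wrap(~Statistic)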