tidymodels / yardstick

Tidy methods for measuring model performance
https://yardstick.tidymodels.org/

precision() returns NA in multiclass call. Warning instead? #98

Closed oliverroc closed 5 years ago

oliverroc commented 5 years ago

Hello,

Yesterday I installed yardstick and started using it for a classification problem I am working on. I have no more than 400 categories to classify.

I was using the precision and recall functions. The first one returned .estimate = NA and the second one returned 0.7. I think this might be because I have some true categories with no correct predictions, but I am not sure...

If that is the reason for the NA, wouldn't it be better to compute the statistic with na.rm = TRUE and raise a warning listing the categories with no matching prediction?

Here I created a simple example to reproduce my case:

library(dplyr)
library(yardstick)

test <- data.frame(real = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                   pred = c(1, 2, 3, 4, 5, 6, 9, 1, 6, 10))
# Both columns share the same 10 levels, even though 7 and 8 are never predicted
test <- test %>%
  mutate(real = factor(real, levels = 1:10),
         pred = factor(pred, levels = 1:10))
test <- as_tibble(test)
table(test)
    pred
real 1 2 3 4 5 6 7 8 9 10
  1  1 0 0 0 0 0 0 0 0  0
  2  0 1 0 0 0 0 0 0 0  0
  3  0 0 1 0 0 0 0 0 0  0
  4  0 0 0 1 0 0 0 0 0  0
  5  0 0 0 0 1 0 0 0 0  0
  6  0 0 0 0 0 1 0 0 0  0
  7  0 0 0 0 0 0 0 0 1  0
  8  1 0 0 0 0 0 0 0 0  0
  9  0 0 0 0 0 1 0 0 0  0
  10 0 0 0 0 0 0 0 0 0  1

precision(test, real, pred, na_rm = TRUE)

# A tibble: 1 x 3
  .metric   .estimator .estimate
  <chr>     <chr>          <dbl>
1 precision macro             NA

precision(test, real, pred, na_rm = FALSE)

# A tibble: 1 x 3
  .metric   .estimator .estimate
  <chr>     <chr>          <dbl>
1 precision macro             NA

recall(test, real, pred)

# A tibble: 1 x 3
  .metric .estimator .estimate
  <chr>   <chr>          <dbl>
1 recall  macro            0.7

Thanks,

sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=es_ES.UTF-8       LC_NUMERIC=C               LC_TIME=es_ES.UTF-8        LC_COLLATE=es_ES.UTF-8    
 [5] LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=es_ES.UTF-8    LC_PAPER=es_ES.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] yardstick_0.0.3 dplyr_0.8.0.1  

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0       fansi_0.4.0      utf8_1.1.4       crayon_1.3.4     assertthat_0.2.0 plyr_1.8.4       R6_2.4.0        
 [8] magrittr_1.5     pROC_1.14.0      pillar_1.3.1     cli_1.0.1        rlang_0.3.1      rstudioapi_0.9.0 generics_0.0.2  
[15] tools_3.4.4      glue_1.3.0       purrr_0.3.0      yaml_2.2.0       compiler_3.4.4   pkgconfig_2.0.2  tidyselect_0.2.5
[22] tibble_2.0.1    
DavisVaughan commented 5 years ago

Adding a number of references for this because there isn't a simple answer to what the result should be when a denominator is 0. Note that this affects all 3 of precision/recall/f_meas:

https://stats.stackexchange.com/questions/8025/what-are-correct-values-for-precision-and-recall-when-the-denominators-equal-0

Suggests 1: https://github.com/dice-group/gerbil/wiki/Precision,-Recall-and-F1-measure

Suggests 1: https://stats.stackexchange.com/questions/1773/what-are-correct-values-for-precision-and-recall-in-edge-cases

sklearn returns 0: https://github.com/scikit-learn/scikit-learn/blob/b4c1c4ed833db5b0fbff0d110b040a34a84e1411/sklearn/metrics/classification.py#L1198
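
To see where the zero denominators come from in the example above, here is a rough sketch in base R only. It does not call yardstick and is not meant to mirror yardstick's internals; it just assumes the usual definitions precision = TP / (TP + FP) and recall = TP / (TP + FN), and all variable names are illustrative.

```r
# Rebuild the confusion matrix from the example in this issue (base R only).
real <- factor(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), levels = 1:10)
pred <- factor(c(1, 2, 3, 4, 5, 6, 9, 1, 6, 10), levels = 1:10)
cm <- table(real = real, pred = pred)

tp <- diag(cm)                  # correct predictions per class
precision_denom <- colSums(cm)  # TP + FP: how often each class was predicted
recall_denom <- rowSums(cm)     # TP + FN: how often each class truly occurs

per_class_precision <- tp / precision_denom  # 0 / 0 gives NaN
per_class_recall <- tp / recall_denom

# Classes whose precision is undefined because they were never predicted
undefined_precision <- names(which(precision_denom == 0))
print(undefined_precision)

# Unweighted (macro) means; a single NaN makes the whole mean NaN
macro_precision <- mean(per_class_precision)
macro_recall <- mean(per_class_recall)
print(macro_precision)
print(macro_recall)
```

For this data, classes 7 and 8 are never predicted, so their precision is 0 / 0. Class 9 is predicted once (incorrectly), so its precision is a well-defined 0 rather than undefined. Every class occurs once in `real`, which is why recall has no zero denominators here.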

I don't think NA is ideal to return because in macro averaging you might compute multiple precision values and only have 1 crap out on you. So you might not want to NA the entire averaged score because of this.
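
A minimal sketch of what the alternatives would look like for macro averaging, in base R. `macro_average()` and its `undefined` argument are hypothetical names made up for illustration and are not part of yardstick. The per-class values are the precision values for the example in this issue, hard-coded so the snippet runs on its own.

```r
# Per-class precision for the example in this issue (classes 1 to 10).
# Classes 7 and 8 are never predicted, so their precision is 0 / 0 = NaN.
per_class_precision <- c(0.5, 1, 1, 1, 1, 0.5, NaN, NaN, 0, 1)

# Hypothetical helper (not a yardstick function): macro-average with an
# explicit policy for undefined per-class values.
macro_average <- function(x, undefined = c("propagate", "drop", "zero", "one")) {
  undefined <- match.arg(undefined)
  bad <- is.na(x)  # TRUE for both NA and NaN
  if (any(bad) && undefined != "propagate") {
    warning("Undefined per-class values at positions: ",
            paste(which(bad), collapse = ", "))
  }
  switch(undefined,
    propagate = mean(x),                # one undefined class poisons the mean
    drop      = mean(x[!bad]),          # average over the defined classes only
    zero      = mean(replace(x, bad, 0)),
    one       = mean(replace(x, bad, 1))
  )
}

res_propagate <- macro_average(per_class_precision, "propagate")
res_drop <- macro_average(per_class_precision, "drop")
res_zero <- macro_average(per_class_precision, "zero")
res_one <- macro_average(per_class_precision, "one")
print(c(propagate = res_propagate, drop = res_drop,
        zero = res_zero, one = res_one))
```

On this data the four policies give NaN, 0.75, 0.6 and 0.8 respectively, so the choice is not cosmetic. "zero" corresponds to the sklearn behaviour linked above, "one" to the references that suggest 1, and "drop" with a warning is roughly what was proposed in the original report.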

github-actions[bot] commented 3 years ago

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.