xrobin / pROC

Display and analyze ROC curves in R and S+
https://cran.r-project.org/web/packages/pROC/
GNU General Public License v3.0

case weights #96

Open topepo opened 3 years ago

topepo commented 3 years ago

It would be great to have the calculations for the curve take into account case weights (i.e., a non-negative numeric vector of values the same length as the other data objects).

xrobin commented 3 years ago

I agree this would be cool. Do you have a reference on how this is implemented in the context of ROC curves?

topepo commented 3 years ago

The curve would be based on the weighted versions of sensitivity and specificity.

library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip

data(pathology)
str(pathology)
#> 'data.frame':    344 obs. of  2 variables:
#>  $ pathology: Factor w/ 2 levels "abnorm","norm": 1 1 1 1 1 1 1 1 1 1 ...
#>  $ scan     : Factor w/ 2 levels "abnorm","norm": 1 1 1 1 1 1 1 1 1 1 ...

set.seed(1)
pathology$weights <- runif(nrow(pathology))

event <- "abnorm"

unweighted <- 
  sum(pathology$pathology == event & pathology$scan == event) /
  sum(pathology$pathology == event)
unweighted
#> [1] 0.8953488

# via yardstick:
sensitivity(pathology, pathology, scan)
#> # A tibble: 1 × 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 sens    binary         0.895

weighted <- 
  sum( pathology$weights * (pathology$pathology == event & pathology$scan == event) ) /
  sum( pathology$weights * (pathology$pathology == event) )

weighted
#> [1] 0.9013333

Created on 2021-09-13 by the reprex package (v2.0.0)
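The reprex above covers only sensitivity; specificity can be weighted the same way, using the non-event class. Below is a minimal base-R sketch. The helper names `weighted_sens` and `weighted_spec` and the toy data are hypothetical, made up for illustration, and are not part of yardstick or pROC.

```r
# Hypothetical helpers (not yardstick or pROC functions).
# `truth` and `estimate` are class labels, `weights` are non-negative case
# weights, and `event` is the label treated as the positive class.
weighted_sens <- function(truth, estimate, weights, event) {
  sum(weights * (truth == event & estimate == event)) /
    sum(weights * (truth == event))
}

weighted_spec <- function(truth, estimate, weights, event) {
  sum(weights * (truth != event & estimate != event)) /
    sum(weights * (truth != event))
}

# Toy data, invented for this example only.
truth    <- factor(c("abnorm", "abnorm", "abnorm", "norm", "norm",   "norm"))
estimate <- factor(c("abnorm", "abnorm", "norm",   "norm", "abnorm", "norm"))
w        <- c(1, 2, 1, 3, 1, 2)

sens_w <- weighted_sens(truth, estimate, w, event = "abnorm")  # 3 / 4
spec_w <- weighted_spec(truth, estimate, w, event = "abnorm")  # 5 / 6

# With all weights equal to 1, both reduce to the usual unweighted metrics.
sens_u <- weighted_sens(truth, estimate, rep(1, length(truth)), event = "abnorm")
```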

@davisvaughan has the start of changes that we will be making to yardstick here

xrobin commented 3 years ago

I think I see. The easiest approach would be to update roc.utils.perfs.all.fast directly so that it calculates TP/FP taking the weights into account:

  tp <- cumsum((response.sorted == 1) * weights.sorted)
  fp <- cumsum((response.sorted == 0) * weights.sorted)
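To make the idea concrete, here is a rough, self-contained sketch of weighted ROC points built from cumulative sums. It is not pROC's actual code: `weighted_roc_points` is a hypothetical helper, and it assumes a 0/1 response, that higher predictor values indicate a case, and that there are no tied predictor values (tie handling is left out for brevity).

```r
# Hypothetical helper, not part of pROC.
# Assumptions: `response` is coded 0 (control) / 1 (case), higher `predictor`
# values indicate a case, and predictor values are untied.
weighted_roc_points <- function(response, predictor,
                                weights = rep(1, length(response))) {
  stopifnot(length(response) == length(predictor),
            length(response) == length(weights),
            all(weights >= 0))

  # Sort by decreasing predictor so each row is a "predict case if >= value" cut.
  ord <- order(predictor, decreasing = TRUE)
  response.sorted <- response[ord]
  weights.sorted  <- weights[ord]

  # The parentheses matter: in R, `*` binds more tightly than `==`.
  tp <- cumsum((response.sorted == 1) * weights.sorted)
  fp <- cumsum((response.sorted == 0) * weights.sorted)

  data.frame(
    threshold   = predictor[ord],
    sensitivity = tp / sum(weights[response == 1]),
    specificity = 1 - fp / sum(weights[response == 0])
  )
}

# Toy data, invented for this example only.
response  <- c(1, 1, 0, 1, 0, 0)
predictor <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.2)
weights   <- c(2, 1, 1, 1, 3, 1)

pts      <- weighted_roc_points(response, predictor, weights)
pts_unit <- weighted_roc_points(response, predictor)  # unit weights
```

With unit weights the totals reduce to the case and control counts, so the result matches the usual unweighted sensitivity and specificity.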

A few thoughts on the implementation:

aminadibi commented 3 months ago

I'd love this feature too. The WeightedROC package does it, but it doesn't produce CIs.