xrobin / pROC

Display and analyze ROC curves in R and S+
https://cran.r-project.org/web/packages/pROC/
GNU General Public License v3.0

Summary ROC curves #98

Open wzbillings opened 3 years ago

wzbillings commented 3 years ago

Is your feature request related to a problem? Please describe.

Recently, I've been fitting ROC curves to many bootstrap resamples in order to get a bootstrap distribution of the AUC as a prediction metric. I've been reading a bit about "SROC" (summary ROC) curves for "averaging" all of the ROCs together, and it sounds like it could be an interesting feature for the package.

Describe the solution you'd like

I would like a function called sroc.roc(), or something similar, that takes one or more ROC curves generated by pROC::roc() and summarizes them into one ROC object. This would potentially require more attributes, such as error, but I've not read enough about the method or implementation to know this yet; I can update this when I learn more.

This seems to involve interpolating along the x-axis, since the ROC curves to be averaged are unlikely to share the same FPR points, so their sensitivities cannot be averaged directly.
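As a rough illustration of the interpolation step, here is a minimal base-R sketch of "vertical" averaging: each curve is linearly interpolated onto a shared FPR grid and the sensitivities are averaged pointwise. The helper name `average_roc`, the grid spacing, and the toy curves are assumptions made for this example. None of this is part of pROC, and it is only one of several possible ways to summarize a set of curves.

```r
# Vertical averaging of ROC curves on a shared FPR grid (base R only).
# `average_roc` is a hypothetical helper, not a pROC function.
average_roc <- function(curves, grid = seq(0, 1, by = 0.01)) {
  # Interpolate each curve's TPR at every grid FPR.
  # ties = max keeps the upper end of any vertical segment.
  tpr <- sapply(curves, function(cv) {
    stats::approx(cv$fpr, cv$tpr, xout = grid, ties = max)$y
  })
  data.frame(
    fpr = grid,
    tpr = rowMeans(tpr),            # pointwise mean sensitivity
    sd  = apply(tpr, 1, stats::sd)  # pointwise spread across curves
  )
}

# Two toy curves with different FPR points (made-up values).
curve_a <- list(fpr = c(0, 0.1, 0.4, 1), tpr = c(0, 0.6, 0.9, 1))
curve_b <- list(fpr = c(0, 0.2, 0.5, 1), tpr = c(0, 0.5, 0.8, 1))

avg <- average_roc(list(curve_a, curve_b))

# Trapezoidal AUC of the averaged curve.
auc_avg <- sum(diff(avg$fpr) * (head(avg$tpr, -1) + tail(avg$tpr, -1)) / 2)
```

With real bootstrap fits, the `fpr` and `tpr` vectors would come from each fitted curve (sorted by increasing FPR), and the `sd` column gives a crude pointwise band around the average.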

Describe alternatives you've considered

Of course, something like bootstrapping and taking the mean of the AUC is sufficient, as is using LOESS to get an "idea" of a summary of the ROC curves. But I'm not sure there's a true "alternative" to the SROC curve, because of the interpolation issue described above.

Additional context

See e.g.
https://www.mwsug.org/proceedings/2010/health/MWSUG-2010-91.pdf
https://www.onlinelibrary.wiley.com/doi/abs/10.1002/sim.1099

xrobin commented 3 years ago

Interesting! I wasn't aware of SROC curves.

From a first look it sounds to me like they will take single points from different studies. So the input wouldn't be

sroc(roc1, roc2, ...)

but more like:

sroc(point1, point2, ...)

Building a ROC curve from points in the SE/SP space/contingency tables is a feature that has been requested in the past, so there's definitely interest for that.
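To make the "points from different studies" idea concrete, here is a hedged sketch of one classical way to fit a summary curve through study-level points: the Moses-Littenberg linear regression on the logit scale. This is an assumption about which SROC method is meant, since the thread does not name one. The 2x2 counts are made up purely for illustration, and nothing here is an existing pROC API.

```r
# Study-level 2x2 counts (made-up numbers, for illustration only).
studies <- data.frame(
  tp = c(45, 30, 60, 25),
  fn = c( 5, 10, 15,  5),
  fp = c(10,  5, 20,  3),
  tn = c(40, 55, 80, 47)
)

# One (sensitivity, specificity) point per study.
# Adding 0.5 to each cell is a common continuity correction that
# avoids logit(0) and logit(1).
se <- (studies$tp + 0.5) / (studies$tp + studies$fn + 1)
sp <- (studies$tn + 0.5) / (studies$tn + studies$fp + 1)

logit <- stats::qlogis
D <- logit(se) - logit(1 - sp)  # log diagnostic odds ratio
S <- logit(se) + logit(1 - sp)  # proxy for the positivity threshold

# Moses-Littenberg: regress D on S.
fit <- stats::lm(D ~ S)
a <- unname(stats::coef(fit)[1])
b <- unname(stats::coef(fit)[2])

# Back-transform the fitted line D = a + b * S into ROC space:
# logit(TPR) = (a + (1 + b) * logit(FPR)) / (1 - b)
fpr  <- seq(0.01, 0.99, by = 0.01)
sroc <- data.frame(
  fpr = fpr,
  tpr = stats::plogis((a + (1 + b) * logit(fpr)) / (1 - b))
)
```

The resulting `sroc` data frame is a smooth curve, not an empirical ROC, which is one reason a separate object class may fit better than reusing `roc` objects.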

I assume you're also aware of the review by Fawcett, which presents algorithms with pseudocode to average ROC curves (and handle the interpolation you mention above). Of note, the coords function is able to interpolate the curve properly (except the thresholds).
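For reference, a short sketch of how coords can be asked for interpolated sensitivities at arbitrary specificities. It assumes a reasonably recent pROC (a version that supports `transpose = FALSE` and `quiet = TRUE`) and uses the `aSAH` example dataset that ships with the package. The grid of specificities is an arbitrary choice for the example.

```r
library(pROC)

data(aSAH)  # example dataset shipped with pROC
r <- roc(aSAH$outcome, aSAH$s100b, quiet = TRUE)

# Sensitivities interpolated at specificities that need not match
# any observed threshold. Thresholds are deliberately not requested.
sp_grid <- seq(0, 1, by = 0.1)
pts <- coords(r, x = sp_grid, input = "specificity",
              ret = c("specificity", "sensitivity"),
              transpose = FALSE)
```

Calling coords this way on each curve with the same `sp_grid` would give sensitivities that line up and could be averaged pointwise.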

Most likely this will require a new type of ROC object, of class sroc, summary.roc or similar. It will be interesting to think about which methods can be applicable (i.e. can you calculate var(sroc), roc.test(sroc1, sroc2), etc.).