Open sieberts opened 4 years ago
Thanks @sieberts. Could you provide a link to the code here?
There's probably a more efficient way to code this, but I did this quickly many years ago. auc_functions.R.zip
Thanks. I will have @mjrmason chime in here on what ROC we use.
Most recently, in ctd chemosensitivity, we used limma's auROC function.
Are we talking about integrated or regular AUC? I can't tell from your thread. For integrated, all I remember is that you have to be careful, since there is a package that yields poor/inconsistent results and that one could easily use by accident. The Multiple Myeloma Challenge ran into this. Here is my code for the iAUC. I believe the timeROC() function, which is from a different package, is what is problematic. This one is good.
The code I posted is Gustavo's algorithm for integrated AUC. I'm not sure what's in the repository currently from the pROC package.
Gustavo's algorithm is pretty slow, but it more accurately estimates the AUC when participants can submit binary OR probability predictions. If participants only generate probability predictions, it is not worth running.
@sieberts This is really helpful info to have. Perhaps we can wrap both the function you provided and the limma function in this package and document the scenarios in which a user might want to pick one or the other.
Just to clarify, did you and @mjrmason provide (conceptually) the same iAUC function? Which should we include?
I think the code @sieberts provided is for AUC and prAUC and handles ties better, but it is not an integrated AUC. @sieberts please correct me if I am wrong.
@mjrmason -
Do you have a reference for what you're calling integrated AUC?
Hey @sieberts,
Sorry for all the back and forth on this. Here is the reference. I also attached the PDF, I think. Let me know if you can't access it and I'll email it.
The term "integrated" is super confusing, since naturally one would use integration to find an area under a curve. The "integration" in "integrated AUC" or "iAUC" refers specifically to survival models or something similar, where different time points can be used to call a patient high risk, essentially turning a survival analysis problem into a classification problem. In these situations you may not be sure the time point you used is the best one, and you might want to use, for example, a cutoff 2 months later or earlier. The iAUC enables integration across a range of cutoff times. It can be thought of as averaging across AUCs computed with different time points used to classify your samples/patients. The R package I use for this is risksetROC, and its function IntegrateAUC has a nice example.
As a side note, if you were considering using a time range from 0 to the last observed point, then the iAUC would just yield the concordance index. So the iAUC can be thought of as a special case of the concordance index narrowed to a specific range of time points.
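For readers who have not met the concordance index, here is a minimal Python sketch of it on uncensored data. The function name `concordance_index` and the toy numbers are made up for illustration. It ignores censoring entirely, so it is not a substitute for a survival package.

```python
from itertools import combinations


def concordance_index(times, risk):
    """Fraction of comparable pairs in which the earlier event has the higher risk.

    Assumes every event time is observed (no censoring). Pairs with tied
    event times are skipped; pairs with tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    for (t_i, r_i), (t_j, r_j) in combinations(zip(times, risk), 2):
        if t_i == t_j:
            continue  # tied event times are not comparable in this sketch
        comparable += 1
        early_risk, late_risk = (r_i, r_j) if t_i < t_j else (r_j, r_i)
        if early_risk > late_risk:
            concordant += 1.0
        elif early_risk == late_risk:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: months to event and a predicted risk score per patient.
times = [2, 5, 8, 12, 20]
risk = [0.9, 0.4, 0.7, 0.3, 0.1]
c_index = concordance_index(times, risk)
```

On this toy data, 9 of the 10 pairs are ordered correctly. The one miss is the patient with an event at 5 months, who was scored lower than the patient with an event at 8 months.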
Note: to use the iAUC you have to have AUCs computed for multiple cutoff times. I use timeROC() from the timeROC package for this, though you could do it "manually." There is an alternative package called survivalROC, referencing the same Heagerty & Zheng paper, that could be used for computing AUCs for multiple cutoff times, but I found its survivalROC() function to produce very strange results sometimes, so I tell people to avoid the package.
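To make the "manual" route concrete, here is a rough Python sketch of computing an AUC at several cutoff times and averaging them. It is a simplification under stated assumptions: all event times are observed (no censoring), and the average is unweighted. It is not what timeROC() or IntegrateAUC compute internally. The names `auc_at_cutoff` and `averaged_auc` and the toy data are hypothetical.

```python
def auc(labels, scores):
    """Pairwise AUC: P(case score > control score), ties counted as half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def auc_at_cutoff(times, risk, cutoff):
    """Call a patient a case if the event happened by `cutoff`, else a control.

    Assumes no censoring, which real survival data rarely satisfies.
    """
    labels = [1 if t <= cutoff else 0 for t in times]
    return auc(labels, risk)


def averaged_auc(times, risk, cutoffs):
    """Unweighted mean of the per-cutoff AUCs (a stand-in for a proper iAUC)."""
    values = [auc_at_cutoff(times, risk, c) for c in cutoffs]
    return sum(values) / len(values), values


# Hypothetical data: months to event and a predicted risk score per patient.
times = [2, 5, 8, 12, 20]
risk = [0.9, 0.4, 0.7, 0.3, 0.1]
mean_auc, per_cutoff = averaged_auc(times, risk, cutoffs=[4, 6, 10])
```

With these numbers the AUC is 1.0 at the 4 and 10 month cutoffs and 5/6 at 6 months, so the choice of cutoff visibly changes the score. That sensitivity is the motivation for summarizing over a range of cutoffs.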
Let me know if you want to discuss. Apologies for this insanely long response.
Yes, that's definitely different. Gustavo's is an algorithm to interpolate AUC calculations when there are ties in the submission.
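To illustrate why ties matter, here is a minimal Python sketch of a tie-aware AUC. This is not Gustavo's algorithm, which is only available in the attached R file. It is the standard pairwise (Mann-Whitney) formulation in which a tied case/control pair counts as half, which is equivalent to interpolating the ROC curve linearly across each block of tied scores. The function name `auc_with_ties` and the example data are made up.

```python
def auc_with_ties(labels, scores):
    """AUC as P(pos > neg) + 0.5 * P(pos == neg) over all pos/neg pairs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie is credited as half a correct ordering
    return wins / (len(pos) * len(neg))


labels = [1, 1, 0, 0]
binary_calls = [1, 0, 1, 0]            # a submission of hard 0/1 calls
probabilities = [0.9, 0.4, 0.6, 0.1]   # a submission of probabilities

auc_binary = auc_with_ties(labels, binary_calls)
auc_prob = auc_with_ties(labels, probabilities)
```

A binary submission produces many tied pairs, so the way ties are credited drives the score. A probability submission with no repeated values has no ties, and any correct AUC routine gives the same answer.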
Not sure what algorithm for ROC is used, but Gustavo likes the integrated AUROC/AUPR. I implemented a version in R that someone could clean up and add if necessary.