Open Wieser145 opened 1 year ago
Good point! This definitely should be a function parameter to plot_mean_roc().
If you haven't done it already, just add a parameter level (for example 0.9) and then change:
tpr_lower = ret_mean.tpr - 1.96 * tpr_std / np.sqrt(n_samples)
into:
tpr_lower = ret_mean.tpr - (norm.ppf(1 - (1 - level) / 2) * tpr_std / np.sqrt(n_samples))
What I am not sure about is whether this calculation of the confidence interval is right. In R, rocit.ciRoc uses a totally different approach for calculating the standard deviation of the true positive rates. If we do bootstrapping, I am not sure we can expect the rates to be normally distributed; rather, they follow an empirical (nearly binomial) distribution.
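For reference, the change suggested here could be sketched roughly as follows. This is only a minimal sketch, not the library's actual code: the helper name mean_roc_ci and its arguments are hypothetical, and clipping the band to [0, 1] is my own assumption.

```python
import numpy as np
from scipy.stats import norm


def mean_roc_ci(tpr_mean, tpr_std, n_samples, level=0.95):
    """Normal-approximation band around a mean TPR curve (hypothetical helper)."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    tpr_mean = np.asarray(tpr_mean, dtype=float)
    tpr_std = np.asarray(tpr_std, dtype=float)
    # Two-sided critical value; roughly 1.96 for level=0.95
    z = norm.ppf(1 - (1 - level) / 2)
    half_width = z * tpr_std / np.sqrt(n_samples)
    # Assumption: clip to [0, 1] because a TPR outside that range is meaningless
    lower = np.clip(tpr_mean - half_width, 0.0, 1.0)
    upper = np.clip(tpr_mean + half_width, 0.0, 1.0)
    return lower, upper
```

With level=0.95 this reproduces the hard-coded 1.96 multiplier up to rounding, and a smaller level such as 0.9 gives a narrower band.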
In R they are using the following code:
var_term1 <- TPR * (1-TPR)/pos_count # looks like sqrt(pq/n), but I am not sure what is meant by pos_count
SE_TPR <- sqrt(var_term1)
multiplier <- qnorm((1+level)/2)
upper <- TPR + multiplier * SE_TPR
Any idea?
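To make the comparison easier, here is a rough Python translation of the R snippet. It is a sketch under assumptions, not a port of the R package: the function name binomial_tpr_ci is hypothetical, I assume pos_count is the number of positive-class samples (the denominator of the TPR), and the lower bound is added by symmetry since the snippet only shows the upper one.

```python
import numpy as np
from scipy.stats import norm


def binomial_tpr_ci(tpr, pos_count, level=0.95):
    """Binomial (Wald-style) interval for TPR values (hypothetical helper)."""
    tpr = np.asarray(tpr, dtype=float)
    # Standard error of a proportion: sqrt(p * (1 - p) / n)
    # Assumption: n is the number of positive samples (pos_count)
    se_tpr = np.sqrt(tpr * (1 - tpr) / pos_count)
    # Same critical value as norm.ppf(1 - (1 - level) / 2)
    multiplier = norm.ppf((1 + level) / 2)
    # Assumption: symmetric lower bound, clipped to the valid TPR range
    lower = np.clip(tpr - multiplier * se_tpr, 0.0, 1.0)
    upper = np.clip(tpr + multiplier * se_tpr, 0.0, 1.0)
    return lower, upper
```

Note that (1 + level) / 2 equals 1 - (1 - level) / 2, so the multiplier is identical in both approaches; the difference lies only in how the standard error is estimated (binomial formula versus the spread across bootstrap samples).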
Thanks for investigating this. I need to sit down and check this myself. Unfortunately, I'm quite busy right now. I may find time next week or the week after.
I definitely made very loose assumptions for the bootstrapping approach. I may have been overoptimistic. I certainly followed some reference implementation, but it has been too long for me to remember exactly which one I used when I was working on this. It's also possible that the reference was using bootstrapping in a context other than ROC analysis.
Happy to receive more suggestions, especially if you find that I implemented things completely wrong :)
Thank you, and I am happy to help optimize it.
Currently you cannot specify a different confidence level for the mean ROC curve; 95% is fixed. It would be great to have an option to change it in the future!
In the function plot_mean_roc():
if show_ci:
    # 95% confidence interval