jacobgil / confidenceinterval

The long missing library for python confidence intervals
MIT License

recall_score_bootstrap does not match tpr_score_bootstrap #8

Open AdamBajger opened 7 months ago

AdamBajger commented 7 months ago

Sensitivity, a.k.a. the true positive rate, should be calculated consistently across the library. I can understand that there will be slight differences when using bootstrap methods to calculate the confidence intervals, but not an inconsistency like the one in this minimal working example:

from confidenceinterval.takahashi_methods import recall_score_bootstrap, precision_score_bootstrap
from confidenceinterval.binary_metrics import tnr_score_bootstrap, ppv_score_bootstrap, tpr_score_bootstrap
from numpy.testing import assert_allclose, assert_almost_equal

def get_samples_based_on_tfpn(tp, tn, fp, fn) -> tuple[list[int], list[int]]:
    # Build label vectors that realise exactly the requested confusion-matrix counts.
    ground_truth = [1] * tp + [0] * tn + [1] * fn + [0] * fp
    predictions = [1] * tp + [0] * tn + [0] * fn + [1] * fp
    return ground_truth, predictions

tp_, tn_, fp_, fn_ = 679, 1366, 69, 69

y_true, y_pred = get_samples_based_on_tfpn(tp_, tn_, fp_, fn_)

sensitivity_, sensitivity_ci_ = recall_score_bootstrap(y_true=y_true, y_pred=y_pred, confidence_level=0.95, method='bootstrap_bca')
sensitivity, sensitivity_ci = tpr_score_bootstrap(y_true=y_true, y_pred=y_pred, confidence_level=0.95, method='bootstrap_bca')

assert_almost_equal(sensitivity, sensitivity_, decimal=3, err_msg=f"Sensitivity: {sensitivity} != {sensitivity_}")
This fails with:

AssertionError:
Arrays are not almost equal to 3 decimals
Sensitivity: 0.9077540105738298 != 0.9367842418689877
 ACTUAL: 0.9077540105738298
 DESIRED: 0.9367842418689877

I have looked into the source code and noticed several inconsistencies in the docstrings, where the terms "sensitivity" and "specificity" are mixed arbitrarily, which suggests unchecked copy-pasting of code and may be where the error originates. I have not been able to pinpoint where the error actually lies, though.

jacobgil commented 7 months ago

Hello, thanks for sharing.

The difference is that recall_score_bootstrap defaults to 'micro' averaging. If you pass average='binary', it will call tpr_score under the hood: https://github.com/jacobgil/confidenceinterval/blob/main/confidenceinterval/takahashi_methods.py#L374
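For example (a minimal sketch continuing the snippet above, assuming the average keyword is forwarded exactly as in the linked source), the two estimates agree once the averaging mode matches:

sensitivity_binary, sensitivity_binary_ci = recall_score_bootstrap(
    y_true=y_true,
    y_pred=y_pred,
    confidence_level=0.95,
    method='bootstrap_bca',
    average='binary')

# The point estimate now matches tpr_score_bootstrap; the bootstrap CIs may
# still differ slightly because of resampling randomness.
assert_almost_equal(sensitivity_binary, sensitivity, decimal=3)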

Recall is equivalent to the TPR only with 'binary' averaging: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html
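To see the averaging difference concretely, here is a sketch using scikit-learn directly (not this library), reusing y_true and y_pred from the example above:

from sklearn.metrics import accuracy_score, recall_score

# Micro averaging pools TP and FN over both classes, so for a binary problem
# it reduces to plain accuracy: (679 + 1366) / 2183 ~= 0.9368.
print(recall_score(y_true, y_pred, average='micro'))
print(accuracy_score(y_true, y_pred))

# Binary averaging is the TPR of the positive class:
# 679 / (679 + 69) ~= 0.9078, the value tpr_score_bootstrap reports.
print(recall_score(y_true, y_pred, average='binary'))

These are exactly the two values shown in the assertion error above.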

I admit this isn't clear enough from the README. Better documentation would help a lot here.