Closed Vincent-Maladiere closed 1 year ago
Let's add some tests to check some mathematical properties:
IPCWEstimator
should constantly predict 1.0 when the training data has no censored values,assert_allclose(ipcw.predict([0]), [1.0])
when ipcw
is fit on random data, even when the shorter time in the training set is censored.ipcw.predict([t + eps])
is strictly larger than 1.0 when ipcw is fit on random
censored data and t is the smallest censored duration in the training set.IPCWEstimator
predicts monotonically increasing values on a np.linspace(0, t_max, 100)
when fit on random censored event data.1 / ipcw.min_censoring_survival_prob
for different values of min_censoring_survival_prob
.Ok @Vincent-Maladiere I think this is ready for final pass of review on your end.
I rendered the HTML of the doc locally but I realize that we do not have a CI job to do it for the PR. Only for publishing the result in the merge of the PR.
I think the easiest way to achieve this would be to configure a circle CI job for this project. Let's do that later in a separate PR.
What does this PR implement?
This is the implementation of the Brier score metrics module:
BrierScoreComputer
to be moved to the GBIncidence source codeBrierScoreSampler
brier_score
brier_score_incidence
integrated_brier_score
integrated_brier_score_incidence
This also implements the
IPCWEstimator
used by the BrierScoreComputer. We have a dependency on lifelines but not on scikit-survival.Remarks
- This fixes a subtle bug in~~Edit:y_binary
, making our previous implementation ofbrier_score()
incorrect. It now matches scikit-survival results.y_true_binary
is 1 when we observed an event, and 0 otherwise, which is correct for our Gradient Boosting classification or regression target. However, when computing the Brier Score metric, we actually neednp.logical_not(y_true_binary)
. Indeed, when we observed an event, the survival probability drops to 0. Otherwise, it stays at 1. The Brier Score uses the survival probability, not the incidence probability.~~brier_score
computes is the regular Brier Score like scikit-survival does, where they_pred
input is the survival probability. On the contrary,brier_score_incidence
follows the Kretowska formula for competing events and expectsy_pred
to be the incidence probability for the kth cause of failure instead. Under the hood though,brier_score
relies onbrier_score_incidence
, by simply inputting1 - y_pred
.This is needed to disambiguate our Brier Score implementation. We could also only expose
brier_score_incidence
without the regularbrier_score
, but I thought it was handy to have it, instead of telling the user to perform1 - y_pred
for binary events, all the time.When running
times
must be the times used for computingy_pred
—of shape(n_samples, n_times)
— otherwise, the Brier Score will be incorrect. This is tricky to check wheny_pred
is not a dataframe, so we only check for shapes.