sebp / scikit-survival

Survival analysis built on top of scikit-learn
GNU General Public License v3.0

Cumulative Incidence (Kaplan-Meier) for competing risks. #491

Open mvlvrd opened 2 weeks ago

mvlvrd commented 2 weeks ago

Add non-parametric Cumulative Incidence (Kaplan-Meier) for competing risks. Tests included.

What does this implement/fix? Explain your changes

Implements a cumulative incidence estimator for the competing risks case. Confidence intervals are not implemented yet but are under development.

codecov[bot] commented 2 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 98.22%. Comparing base (41a5200) to head (6034592). Report is 6 commits behind head on master.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #491      +/-   ##
==========================================
+ Coverage   98.21%   98.22%   +0.01%
==========================================
  Files          37       37
  Lines        3521     3556      +35
  Branches      464      471       +7
==========================================
+ Hits         3458     3493      +35
  Misses         30       30
  Partials       33       33
```


mvlvrd commented 2 weeks ago

The Codacy linter flags unused unpacked assignments, but the ruff linter does not. It would be nice to make the two consistent, though it is not a big issue. (I can open a new issue/PR if needed; it is trivial to fix from the ruff side.)
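For readers unfamiliar with the term, here is a minimal sketch of what an "unused unpacked assignment" looks like and one common way to silence it. The function names are hypothetical and are not taken from this PR; which rule each linter applies is not stated in the thread.

```python
def min_max(values):
    """Return the smallest and largest element as a tuple."""
    return min(values), max(values)


def largest_flagged(values):
    # `lo` is unpacked but never read; some linters report this.
    lo, hi = min_max(values)
    return hi


def largest_clean(values):
    # A leading underscore marks the name as intentionally unused,
    # a convention most linters respect.
    _lo, hi = min_max(values)
    return hi
```

Both functions behave identically; only the naming of the discarded value differs.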

mvlvrd commented 1 week ago

Thanks so much for the review! I think I have addressed most of your points. I looked for a real dataset with more than two risks but couldn't find one, so I added a simulated one. Corner cases, such as zero events for one or more risks, are not handled and raise an error.
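As an illustration only, a dataset with more than two risks could be simulated along the following lines. This is a sketch, not the dataset added in the PR: the function name, the exponential rates, and the latent-time construction (independent latent times per cause, chosen purely for simplicity) are all assumptions.

```python
import numpy as np


def simulate_competing_risks(n_samples, rates, censoring_rate, seed=0):
    """Simulate (event, time); event is 0 for censored, 1..K for the cause.

    Hypothetical helper: each cause gets an independent exponential
    latent time, and the earliest one determines the observed cause.
    """
    rng = np.random.default_rng(seed)
    rates = np.asarray(rates, dtype=float)
    # One latent time per sample and cause (scale = 1 / rate).
    latent = rng.exponential(scale=1.0 / rates, size=(n_samples, rates.size))
    event_time = latent.min(axis=1)
    cause = latent.argmin(axis=1) + 1  # causes are numbered from 1
    # Independent exponential censoring times.
    censor_time = rng.exponential(scale=1.0 / censoring_rate, size=n_samples)
    observed = event_time <= censor_time
    time = np.where(observed, event_time, censor_time)
    event = np.where(observed, cause, 0)
    return event, time


event, time = simulate_competing_risks(500, rates=[0.5, 0.3, 0.2], censoring_rate=0.3)
# Events per label (index 0 = censored); a zero count for any cause
# is the corner case mentioned above.
counts = np.bincount(event, minlength=4)
```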

I also changed the function name, removing the reference to Kaplan-Meier. I found a couple of papers that use the name KM for the naive extension of KM to the competing risks case, whereas we are using the rigorous estimator, which does not assume independence between risks.
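The difference between the two estimators can be sketched in a few lines of NumPy. This is not the code from the PR; it assumes the "rigorous estimator" is the standard Aalen-Johansen-type cumulative incidence, and both function names are hypothetical. The naive version treats every other cause as censoring and returns 1 - KM.

```python
import numpy as np


def _risk_table(event, time):
    """Unique times, number at risk, and per-cause event counts."""
    event = np.asarray(event, dtype=int)
    time = np.asarray(time, dtype=float)
    uniq = np.unique(time)  # sorted unique observation times
    n_at_risk = np.array([(time >= t).sum() for t in uniq])
    n_risks = int(event.max())
    # d[k - 1, i] = number of events of cause k at uniq[i]
    d = np.array([[((time == t) & (event == k)).sum() for t in uniq]
                  for k in range(1, n_risks + 1)])
    return uniq, n_at_risk, d


def cumulative_incidence_sketch(event, time):
    """Aalen-Johansen-type estimate: event 0 = censored, 1..K = cause."""
    uniq, n_at_risk, d = _risk_table(event, time)
    # Overall Kaplan-Meier survival, counting events of any cause.
    surv = np.cumprod(1.0 - d.sum(axis=0) / n_at_risk)
    surv_prev = np.concatenate(([1.0], surv[:-1]))  # S(t-) at each time
    # CIF_k(t) = sum over t_i <= t of S(t_i-) * d_ki / n_i
    cif = np.cumsum(surv_prev * d / n_at_risk, axis=1)
    return uniq, surv, cif


def naive_one_minus_km(event, time, cause):
    """Naive 1 - KM for one cause, treating other causes as censored."""
    uniq, n_at_risk, d = _risk_table(event, time)
    return uniq, 1.0 - np.cumprod(1.0 - d[cause - 1] / n_at_risk)


event = [1, 2, 0, 1, 2, 0]
time = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
times, surv, cif = cumulative_incidence_sketch(event, time)
_, naive = naive_one_minus_km(event, time, cause=1)
```

In this toy example the cumulative incidences and the overall survival sum to one at every time point, while the naive 1 - KM curve for cause 1 ends above the corresponding cumulative incidence (4/9 versus 7/18), which is the overestimation the rename is meant to avoid confusion with.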