Open opendataminer opened 8 months ago
For what it's worth: the apparent R/Python mismatch might have the same source as the R/Stata mismatch, as described here.
I mention it only because when I changed basehaz(cox_pp_00, centered=FALSE) in your MWE to basehaz(cox_pp_00, centered=TRUE), the output was identical to that of centered=FALSE:
> basehaz(cox_pp_00, centered=FALSE)
hazard time
1 0.5052761 2
2 0.8409462 3
3 1.0434840 6
4 1.2974621 7
5 1.5514402 8
6 2.0066161 9
7 2.0066161 14
8 2.0066161 17
> basehaz(cox_pp_00, centered=TRUE)
hazard time
1 0.5052761 2
2 0.8409462 3
3 1.0434840 6
4 1.2974621 7
5 1.5514402 8
6 2.0066161 9
7 2.0066161 14
8 2.0066161 17
The identical output, to me, suggests that centered isn't touching the part of survival's behavior that gives rise to the R/Stata differences, meaning that behavior is still in play as an explanation for the R/Python discrepancy.
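For context, here is a minimal sketch of what the centered switch is documented to change: the covariate value at which the baseline is evaluated (zero versus the covariate mean). It uses a hand-rolled Breslow estimator with one covariate. The function name breslow_cumhaz, the toy data, and the value of beta are all made up for illustration; this is not the code path of survival or of any Python package, and it says nothing about tie handling or which time points each package reports.

```python
import math

def breslow_cumhaz(times, events, xs, beta, x_ref=0.0):
    """Breslow cumulative baseline hazard for one covariate, evaluated at x_ref.

    Hypothetical helper for illustration only; returns values at event times.
    """
    # Relative risk of each subject versus the reference covariate value.
    risk = [math.exp(beta * (x - x_ref)) for x in xs]
    event_times = sorted({t for t, e in zip(times, events) if e})
    cumhaz, total = [], 0.0
    for t in event_times:
        deaths = sum(1 for ti, e in zip(times, events) if e and ti == t)
        at_risk = sum(r for ti, r in zip(times, risk) if ti >= t)
        total += deaths / at_risk
        cumhaz.append(total)
    return event_times, cumhaz

# Made-up toy data, unrelated to the MWE in this issue.
times = [1, 2, 2, 4, 5, 7]
events = [1, 1, 0, 1, 0, 1]
xs = [0.5, 1.5, 1.0, 2.0, 0.0, 1.0]
beta = 0.7  # assumed coefficient, not a fitted value

x_bar = sum(xs) / len(xs)
t, h_zero = breslow_cumhaz(times, events, xs, beta, x_ref=0.0)    # like centered=FALSE
_, h_mean = breslow_cumhaz(times, events, xs, beta, x_ref=x_bar)  # like centered=TRUE
ratios = [hm / hz for hm, hz in zip(h_mean, h_zero)]
print(ratios, math.exp(beta * x_bar))
```

In this sketch the two curves differ by the constant factor exp(beta * x_bar) at every time point, so they coincide only when that factor is 1 (for example, a zero coefficient or a covariate with mean zero). Whether that is what happens in the MWE is something I have not checked.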
A very simple test case that illustrates the discrepancy: the parameter estimates for x are identical in both cases, but the baseline cumulative hazards are different, and even the time index t takes different values! My fault if this is due to any misuse of the two packages.
Python Version
Output
R Version
Output