CamDavidsonPilon / lifelines

Survival analysis in Python
lifelines.readthedocs.org
MIT License
2.38k stars 560 forks

Formula in docs #1192

Open CamDavidsonPilon opened 3 years ago

CamDavidsonPilon commented 3 years ago

There should not be a subscript i on the Xbar in the Cox time-varying formula.

Originally posted by @CamDavidsonPilon in https://github.com/CamDavidsonPilon/lifelines/discussions/1191#discussioncomment-200656

CamDavidsonPilon commented 3 years ago

Also https://lifelines.readthedocs.io/en/latest/Survival%20Regression.html#cox-s-proportional-hazard-model

Cryptojoyz commented 1 year ago

I have a question: why should I subtract the mean value in the Cox time-varying formula?

[image]

In this example, the actual mean of var1 should be 0.3, but when the matrix mean is calculated this value becomes 0.2. What is the actual meaning of the matrix mean?

CamDavidsonPilon commented 1 year ago

Subtracting the mean, or any value really, doesn't change the final inference. We subtract the mean internally for stability reasons during fitting.
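The invariance described above can be checked numerically. The sketch below is not lifelines code: it is a hand-written Cox partial log-likelihood for a plain (non-time-varying) toy dataset with no tied event times, and the data, `beta`, and helper names are all made up for illustration. It evaluates the likelihood on raw covariates, on column-centered covariates, and on covariates shifted by arbitrary constants.

```python
import math

def cox_partial_loglik(beta, rows):
    """Cox partial log-likelihood, assuming no tied event times.

    rows: list of (duration, event, covariates) tuples (toy format, hypothetical).
    """
    total = 0.0
    for t_i, event_i, x_i in rows:
        if not event_i:
            continue  # censored rows contribute only through risk sets
        # Risk set: every subject still under observation at time t_i.
        risk = [x for t, _, x in rows if t >= t_i]
        eta_i = sum(b * v for b, v in zip(beta, x_i))
        denom = sum(math.exp(sum(b * v for b, v in zip(beta, x))) for x in risk)
        total += eta_i - math.log(denom)
    return total

def shift(rows, offsets):
    """Subtract a constant from each covariate column."""
    return [(t, e, [v - c for v, c in zip(x, offsets)]) for t, e, x in rows]

# Made-up data: (duration, event indicator, [var1, var2]).
rows = [
    (5.0, 1, [0.3, 1.0]),
    (8.0, 0, [0.1, 0.0]),
    (3.0, 1, [0.9, 1.0]),
    (9.0, 1, [0.2, 0.0]),
    (6.0, 0, [0.5, 1.0]),
]
col_means = [sum(x[j] for _, _, x in rows) / len(rows) for j in range(2)]
beta = [0.7, -0.4]  # arbitrary coefficients, not fitted values

ll_raw = cox_partial_loglik(beta, rows)
ll_centered = cox_partial_loglik(beta, shift(rows, col_means))
ll_other = cox_partial_loglik(beta, shift(rows, [10.0, -3.0]))
print(ll_raw, ll_centered, ll_other)
```

The three values agree up to floating-point error because the factor exp(-beta·c) introduced by the shift appears in both the numerator and the risk-set sum and cancels. Since the likelihood is the same function of beta, the maximizing coefficients are the same too.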

Cryptojoyz commented 1 year ago

Thank you for your answer, but in results containing interaction terms, this can cause interpretation headaches. For example:

[image]

β3 is the coefficient when prio is 0. After subtracting the mean, however, prio = 0 means that prio actually equals the mean, so this mean should preferably have a practical meaning. The mean of the matrix has no practical counterpart; it is just a mathematical value.

CamDavidsonPilon commented 1 year ago

Subtracting the mean happens internally after all the interactions have been done. If you want to use interactions, you should use the formula= kwarg.

Cryptojoyz commented 1 year ago

Sorry, I don't quite understand what is meant by subtracting the mean after the interaction terms have been created. I used the call below for the fit. When interpreting the results, I would like to know: when X4 takes the value 0, does that mean X4 minus its mean is 0, or that the original value of X4 is 0?

`ctv.fit(base_df, id_col="id", event_col="event", start_col="start", stop_col="stop", show_progress=True, formula="X1+X2+X3*X4")`

If it is similar to linear regression, should the X3X4 term actually be `(X3 - X3bar)(X4 - X4bar)`?

Cryptojoyz commented 1 year ago

Am I to understand that subtracting the mean is doing the centering? If I have centered the matrix (subtracted the mean) before fitting, is subtracting the mean invalid?

CamDavidsonPilon commented 1 year ago

X1+X2+X3*X4 goes in, and then we subtract the mean of each column.
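A minimal sketch of that order of operations, using made-up numbers rather than lifelines internals: in the usual formula syntax, `X3*X4` expands to the columns X3, X4, and the product X3:X4, built from the raw values, and only then is each column demeaned. The helper names here are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def demean(values):
    m = mean(values)
    return [v - m for v in values]

# Made-up raw covariate columns.
x3 = [1.0, 2.0, 3.0, 6.0]
x4 = [0.0, 1.0, 1.0, 2.0]

# Step 1: build the interaction column from the RAW values.
interaction = [a * b for a, b in zip(x3, x4)]

# Step 2: demean every column, including the interaction column.
centered_after = demean(interaction)  # X3*X4 - mean(X3*X4)

# For comparison: centering first and multiplying afterwards.
centered_before = [a * b for a, b in zip(demean(x3), demean(x4))]

print(centered_after)
print(centered_before)
```

The two columns differ: under the order described above, the interaction column is `X3*X4 - mean(X3*X4)`, not `(X3 - X3bar)(X4 - X4bar)`.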

> If I have centered the matrix (subtracted the mean) before fitting, is subtracting the mean invalid?

I don't think it's invalid - it doesn't change the results (the Cox coefficients are the same whether I demean or not)! But if you centered before, and you have interaction terms, then your interpretation will change.

However, if you are doing probability prediction, that's different.
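The remark about interaction terms can be made concrete with a little algebra, sketched here with made-up numbers (none of the coefficients are fitted values). Expanding `(X3 - m3)(X4 - m4)` shows that a model whose covariates were centered before the interaction was built is a reparameterization of the raw model: the interaction coefficient is unchanged, while each main-effect coefficient becomes the effect of that variable when the other one sits at its mean rather than at 0.

```python
# Made-up covariate columns and their means.
x3 = [1.0, 2.0, 3.0, 6.0]
x4 = [0.0, 1.0, 1.0, 2.0]
m3 = sum(x3) / len(x3)
m4 = sum(x4) / len(x4)

# Hypothetical coefficients for the raw parameterization:
#   eta = b3*X3 + b4*X4 + b34*X3*X4
b3, b4, b34 = 0.5, -0.2, 0.1

# Equivalent coefficients when X3 and X4 are centered BEFORE the interaction:
#   eta' = g3*(X3-m3) + g4*(X4-m4) + g34*(X3-m3)*(X4-m4)
g3 = b3 + b34 * m4   # effect of X3 when X4 equals its mean
g4 = b4 + b34 * m3   # effect of X4 when X3 equals its mean
g34 = b34            # interaction coefficient is unchanged

eta_raw = [b3 * a + b4 * b + b34 * a * b for a, b in zip(x3, x4)]
eta_centered = [
    g3 * (a - m3) + g4 * (b - m4) + g34 * (a - m3) * (b - m4)
    for a, b in zip(x3, x4)
]

# The two linear predictors differ by the same constant on every row.
diffs = [r - c for r, c in zip(eta_raw, eta_centered)]
print(diffs)
```

Because the difference is a constant offset, which the Cox partial likelihood ignores, both parameterizations describe the same fit. What changes is how the main-effect coefficients are read: `b3` is the effect of X3 at X4 = 0, while `g3` is the effect of X3 at X4 = m4.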

Cryptojoyz commented 1 year ago

Thank you so much! This is much clearer now. By the way, regarding:

> there should not be a subscript i in the Xbar i on for Cox time varying formula.

The subscript i is still on the Xbar in the latest documentation.