CamDavidsonPilon / lifelines

Survival analysis in Python
lifelines.readthedocs.org
MIT License
2.37k stars 560 forks source link

Instability in the Weibull fits #827

Open amirhosseindavoody opened 5 years ago

amirhosseindavoody commented 5 years ago

I have been testing this package to use this package as a replacement for JMP software to fit Weibull distribution on a bunch of interval censored data. The results does not seem to be stable. When I compare the results with JMP results, always the log likelihood is larger with JMP results. I have narrowed down the root cause to some extent.

There are two problems as far as I can tell:

To have a better solution (this definition is very subjective here), I have figured that it is better to optimize the scale_factor in log space, meaning using log(scale_factor) as the optimization factor, and also use 1/shape_factor as the optimization factor.

@CamDavidsonPilon I was wondering if you have any suggestions or thoughts on getting a more stable result in these cases.

Thanks,

CamDavidsonPilon commented 5 years ago

Hi @amirhosseindavoody, I'm glad you reported this issue. I'm happy to investigate this and come up with a better solution. To confirm, you are using WeibullFitter.fit_interval_censoring, correct?

CamDavidsonPilon commented 5 years ago

For your first example, the following Python code and R code agree:

wf.fit_interval_censoring(np.array([1, 10, 100]), np.array([10, 100, 1000]), event_observed=np.array([0, 0, 0]))
wf.print_summary()
library(icenReg)
ic_par(cbind(c(1, 10, 100), c(10, 100, 1000)) ~ 1)

Do you see lifeline's behaviour being something else, or JMP giving different values?

CamDavidsonPilon commented 5 years ago

Hello @amirhosseindavoody, have you had a chance to investigate this more? Can you share a real life dataset that has convergence problems?

amirhosseindavoody commented 5 years ago

Hi @CamDavidsonPilon , I have been very busy lately. I will find a real life dataset and post it here this week.

pistacliffcho commented 5 years ago

Just a thought: the issue may be statistical rather than numerical. That is, censored data models can suffer from the same issue as logistic regression when there is perfect separation in the data; the MLE of coefficients will be unbounded, although the optimization algorithm will converge with finite values that might not seem huge to some users, i.e., 20, since the change in log-likelihood may be numerically 0 at this point. A specific issue that a Weibull model will face is that the scale/rate parameter will go to 0/infinity at the MLE, and, like the coefficients, will never make it there.

From an algorithm's perspective, it's not so easy to detect this and it's usually considered a responsibility of the user (i.e., as is the case in glm in base R) to investigate if this could be an issue. Without seeing the data, that would be my first guess about what is going on.

CamDavidsonPilon commented 5 years ago

Also a very likely scenario, @pistacliffcho, thanks for mentioning it. For future users, one can turn on show_progress=True in the call to fit and it will display a small summary of convergence details - including if the maximum number of iterations has been exceeded (which is what I suspect would happen in the perfect separation case).

amirhosseindavoody commented 5 years ago

Okay, this is kind of over due and I have finally got time to do some more detailed analysis. Here is an example data set and the results from JMP and lifelines. example_fail_data.txt

I have used scaling factor of 5000 for the start and stop times. The data include right censored and interval censored samples. Also, there are multiple samples (categories) and the Weibull analysis is performed for each category separately. Below you can see the alpha_JMPversus alpha_lifelinesand beta_JMPversus beta_lifelines. The line shows the one-to-one correlation which we want in the ideal case (matching results between JMP and lifelines).

image

image

In my analysis, the JMP result is always better meaning they result in large log likelihood for each categories. I will show those results in another post.

@CamDavidsonPilon any ideas?