derrynknife / SurPyval

A Python package for survival analysis. The most flexible survival analysis package available. SurPyval can work with arbitrary combinations of observed, censored, and truncated data. SurPyval can also fit distributions with 'offsets' with ease, for example the three parameter Weibull distribution.
https://surpyval.readthedocs.io/en/latest/index.html
MIT License

Gumbel distribution #31

Closed lisandrojim closed 1 year ago

lisandrojim commented 1 year ago

I was looking at the survival function of the Gumbel distribution, and I found that it is defined as:

S(x) = np.exp(-np.exp((x - mu)/sigma))

Shouldn't it be:

S(x) = 1 - np.exp(-np.exp(-(x - mu)/sigma))

I changed it manually in the code, and when I fitted the function the problem was not solved (i.e., the fitting is still done with the first equation shown above). Any suggestions on what to do?
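To make the difference between the two expressions concrete, here is a minimal sketch that evaluates both survival functions at a few points. The function names `sf_sev` and `sf_lev` are illustrative only and are not taken from the SurPyval source; the standard library `math` module is used in place of `np` so the snippet has no dependencies.

```python
import math


def sf_sev(x, mu, sigma):
    """First expression in the thread: S(x) = exp(-exp((x - mu)/sigma))."""
    return math.exp(-math.exp((x - mu) / sigma))


def sf_lev(x, mu, sigma):
    """Second expression in the thread: S(x) = 1 - exp(-exp(-(x - mu)/sigma))."""
    return 1.0 - math.exp(-math.exp(-(x - mu) / sigma))


mu, sigma = 0.0, 1.0
for x in (-2.0, 0.0, 2.0):
    print(
        f"x={x:+.1f}  "
        f"first S(x)={sf_sev(x, mu, sigma):.4f}  "
        f"second S(x)={sf_lev(x, mu, sigma):.4f}"
    )
```

Both expressions decrease from 1 towards 0 as x grows, so each is a valid survival function; they simply describe different distributions.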

derrynknife commented 1 year ago

You might have to change the ff function as well

Thanks for picking that up!

lisandrojim commented 1 year ago

Updating the ff function solved the problem. However, the hazard function h(x) and the cumulative hazard H(x) also need to be adjusted, as follows:

Hazard function: h(x) = (1/sigma) * np.exp(-(x - mu)/sigma)

Cumulative hazard function: H(x) = np.exp(-(x - mu)/sigma)

The quantile function (qf) for the Gumbel Distribution still needs to be adjusted so that the random function works correctly. I hope you can update the code with these changes.
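As a cross-check, the sketch below derives the remaining functions directly from the survival function S(x) = 1 - exp(-exp(-(x - mu)/sigma)) proposed earlier in the thread, using the general identities H(x) = -log S(x) and h(x) = f(x)/S(x). One caveat worth noting: with that S(x), -log S(x) does not simplify to exp(-(x - mu)/sigma); that closed form is -log of the CDF rather than of the survival function, so the code computes H and h from S instead of relying on a closed form. All function names here (`lev_sf`, `lev_qf`, and so on) are hypothetical and are not SurPyval's API.

```python
import math
import random


def lev_cdf(x, mu, sigma):
    """CDF: F(x) = exp(-exp(-(x - mu)/sigma))."""
    return math.exp(-math.exp(-(x - mu) / sigma))


def lev_sf(x, mu, sigma):
    """Survival function: S(x) = 1 - F(x), via expm1 for accuracy."""
    return -math.expm1(-math.exp(-(x - mu) / sigma))


def lev_pdf(x, mu, sigma):
    """Density: f(x) = (1/sigma) * exp(-z - exp(-z)), z = (x - mu)/sigma."""
    z = (x - mu) / sigma
    return math.exp(-z - math.exp(-z)) / sigma


def lev_Hf(x, mu, sigma):
    """Cumulative hazard from the general identity H(x) = -log S(x)."""
    return -math.log(lev_sf(x, mu, sigma))


def lev_hf(x, mu, sigma):
    """Hazard from the general identity h(x) = f(x) / S(x)."""
    return lev_pdf(x, mu, sigma) / lev_sf(x, mu, sigma)


def lev_qf(p, mu, sigma):
    """Quantile function for 0 < p < 1: invert p = exp(-exp(-z))."""
    return mu - sigma * math.log(-math.log(p))


def lev_random(n, mu, sigma, rng):
    """Inverse-transform sampling; skips u == 0 to avoid log(0)."""
    samples = []
    while len(samples) < n:
        u = rng.random()
        if 0.0 < u < 1.0:
            samples.append(lev_qf(u, mu, sigma))
    return samples


samples = lev_random(5, mu=1.0, sigma=2.0, rng=random.Random(0))
```

Because random sampling goes through the quantile function, a qf that still inverts the old CDF would generate draws from the wrong distribution, which is consistent with the remark above that the random function depends on it.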

derrynknife commented 1 year ago

Could you do a pull request?

lisandrojim commented 1 year ago

Hello Derryn,

I just made the pull request. I have never done one before, so I hope it worked. Let me know if it didn't.

Cheers, Lisandro


derrynknife commented 1 year ago

G'Day Lisandro.

I've just had a look at this, and in a way, we are both correct.

The current implementation of the Gumbel distribution in surpyval is actually the "Smallest Extreme Value" version of the Gumbel distribution. See https://reliawiki.org/index.php/The_Gumbel/SEV_Distribution for information regarding it. The smallest extreme value distribution is more useful in survival analysis as it captures a weakest-link type of failure.

The implementation of the Gumbel you are suggesting is, correctly, a "Gumbel" distribution, although it is the "Largest Extreme Value" version of it. If you want, instead of changing the current implementation, we can add a new distribution that captures this. Perhaps we can call it "LEV" or "Gumbel2" or something?
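The two versions are mirror images of each other: if X follows the largest extreme value form with location mu and scale sigma, then -X follows the smallest extreme value form with location -mu and the same scale. The sketch below checks this numerically. The function names are illustrative and are not part of surpyval.

```python
import math


def sev_ff(x, mu, sigma):
    """CDF of the smallest extreme value form: 1 - exp(-exp((x - mu)/sigma))."""
    return -math.expm1(-math.exp((x - mu) / sigma))


def lev_sf(x, mu, sigma):
    """Survival function of the largest extreme value form."""
    return -math.expm1(-math.exp(-(x - mu) / sigma))


mu, sigma = 3.0, 1.5
for x in (0.0, 3.0, 7.0):
    # P(X > x) for the LEV form equals P(-X <= -x) for the mirrored SEV form.
    assert math.isclose(lev_sf(x, mu, sigma), sev_ff(-x, -mu, sigma))
```

One consequence, offered as a possible workaround rather than a recommendation: data that follow the largest extreme value form could be negated, fitted with a smallest extreme value model, and the estimated location negated afterwards. Negating the data also mirrors the censoring, so right-censored observations become left-censored ones.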

I'll close this issue, but please feel free to raise another with your preferred direction.

lisandrojim commented 1 year ago

Hello Derryn,

Thanks for the clarification.

There are two possibilities I see:

  1. Rename the current gumbel.py script to e.g., gumbel_lev.py, and create a new script called gumbel.py with the changes I suggested. Note: Please also ensure that all the remaining functions are working accordingly.

or

  2. Add an additional input variable to the current gumbel.py function, where the user can choose between the maximum and minimum extreme value versions of the distribution.

I think the easiest is option 1. Let me know if I can help you with this.
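For comparison, option 2 could look roughly like the sketch below. The function `gumbel_sf` and its `extreme` argument are hypothetical names chosen for illustration; they do not exist in surpyval. The default is assumed to be the minimum version so that existing behaviour would stay unchanged.

```python
import math


def gumbel_sf(x, mu, sigma, extreme="min"):
    """Survival function of either Gumbel variant.

    extreme="min": smallest extreme value form (assumed default).
    extreme="max": largest extreme value form.
    """
    z = (x - mu) / sigma
    if extreme == "min":
        return math.exp(-math.exp(z))
    if extreme == "max":
        return -math.expm1(-math.exp(-z))
    raise ValueError("extreme must be 'min' or 'max'")


s_min = gumbel_sf(0.5, mu=0.0, sigma=1.0)
s_max = gumbel_sf(0.5, mu=0.0, sigma=1.0, extreme="max")
```

A switch like this would have to be threaded through every related function (CDF, density, hazards, quantile function), which is one reason a separate module per variant, as in option 1, may be simpler to maintain.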

Regards, Lisandro
