Closed admash closed 2 months ago
This segfault has also been externally verified on macOS running on ARM. The provided output is attached here:
The issue appears to be with the return value when the iteration does not converge. I'll look deeper. Data sets where survreg does not converge are very rare.
The survreg code first fits a model with only an intercept and scale, to use as starting estimates. That iteration is failing, which leads to invalid arguments for the C routine that then crashes. This first fit has never failed before, and I have no checks for that. That will be easy to fix. Failure was guaranteed in your small data set (only one obs and 2 parameters); the bigger set is an interesting puzzle to understand.
Now fixed. There was an error such that step halving was not properly invoked if the trial loglik was infinite. Your data set leads to a particularly bad first Newton-Raphson step.
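For readers following along, here is a minimal sketch of the kind of safeguard described above: a Newton-Raphson iteration that halves the step whenever the trial log-likelihood is non-finite or worse than the current one. It is not the survival package's C code. The toy objective `loglik` and the function name `newton_step_halving` are assumptions made up for illustration.

```python
import math


def loglik(x):
    """Toy concave log-likelihood log(x) - x, defined only for x > 0 (hypothetical)."""
    return math.log(x) - x if x > 0 else -math.inf


def newton_step_halving(x, max_iter=50, max_halvings=30, tol=1e-10):
    """Maximise loglik by Newton-Raphson, halving any step that makes things worse."""
    current = loglik(x)
    if not math.isfinite(current):
        raise ValueError("starting value has a non-finite log-likelihood")
    for _ in range(max_iter):
        grad = 1.0 / x - 1.0      # first derivative of log(x) - x
        hess = -1.0 / (x * x)     # second derivative, always negative
        step = -grad / hess       # full Newton-Raphson step
        for _ in range(max_halvings):
            trial = loglik(x + step)
            # isfinite() is False for -inf, +inf and NaN, so an infinite
            # trial value can never be accepted as an improvement.
            if math.isfinite(trial) and trial >= current:
                break
            step /= 2.0
        else:
            raise RuntimeError("step halving found no acceptable step")
        x, current = x + step, trial
        if abs(step) < tol:
            return x
    raise RuntimeError("iteration did not converge")
```

Starting from `x = 3.0`, the full first step lands at `-3.0`, where this toy log-likelihood is `-inf`; two halvings bring the trial point back to `1.5`, and the iteration then converges to the maximum at `1.0`. The important detail is that the acceptance test checks finiteness explicitly rather than relying on a plain comparison.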
Thanks Terry. You remain firmly ensconced in my statistical pantheon.
Hi Terry,
Unfortunately, we are still experiencing segfaults. I have attached a .zip file with code and data to reproduce the crash.
Let me know if you would prefer that I make a new issue.
Two issues here. First, you have found a data set for which the initial (intercept, scale) model fit fails. The true solution for (intercept, log(scale)) is approximately (4.5, -0.39). From a starting estimate of (3.04, 0.0162), the first Newton-Raphson step is (17.5, -130.4); the iteration never recovers. The second issue is that I don't have a check for whether that first fit fails, and the infinite value that arises causes the further code to fall apart. I'll fix the second now. The data you sent will get added to my fail directory, and I'll have to work out a solution to the iteration problem. Likely a trust region method.
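The missing check on the initial fit could look something like the sketch below. This is only an illustration of the idea, not the fix that went into survreg: the function name `check_initial_fit` and its arguments are hypothetical, and the real code passes its starting values to a C routine rather than returning a list.

```python
import math


def check_initial_fit(coef, loglik, converged):
    """Return coef as starting values, or raise if the initial fit is unusable.

    Hypothetical helper: coef is the (intercept, log(scale)) estimate from the
    initial fit, loglik its log-likelihood, converged the iteration's status flag.
    """
    if not converged:
        raise ValueError("initial (intercept, scale) fit did not converge")
    if not math.isfinite(loglik):
        raise ValueError("initial fit has a non-finite log-likelihood")
    if not all(math.isfinite(c) for c in coef):
        raise ValueError("initial fit produced non-finite coefficients")
    return list(coef)
```

Failing early with a clear error keeps an infinite or NaN value from reaching later code that assumes valid inputs, which is the failure mode described above.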
Hello!
I have run into a problem where repeated calls to
try(survreg(...))
that do not converge are causing R to segfault. The number of calls necessary to produce a segfault depends on the size of the dataset. I have attached a .zip file with two minimal code examples that produce the crash. One uses a single-row data frame, while the other uses an included data frame of ~11k rows. The smaller data frame results in a segfault after ~250 calls, while the larger data frame produces a segfault after ~10 calls.
survreg-try-reprex.zip
The output logs for the two examples are attached here:
crash-01.log crash-02.log
I am running Arch Linux, with the following output from R.version:
Please let me know if you need further information.
-admash