chjackson / flexsurv

The flexsurv R package for flexible parametric survival and multi-state modelling
http://chjackson.github.io/flexsurv/
55 stars 27 forks source link

coef() bug #19

Closed krz closed 7 years ago

krz commented 7 years ago

I found a strange behaviour in the coef() function of a flexsurvreg object. Here is a minimal example:

library(flexsurv)
df <- data.frame(y = rgamma(100, 0.7),
                 x1 = rnorm(100),
                 x2 = rnorm(100))
m <- flexsurvreg(Surv(y) ~ x1 + x2, data=df, dist = "gengamma")

The output of m is:

Call:
flexsurvreg(formula = Surv(y) ~ x1 + x2, data = df, dist = "gengamma")

Estimates: 
       data mean  est       L95%      U95%      se        exp(est)  L95%      U95%    
mu           NA    0.06233  -0.47074   0.59539   0.27198        NA        NA        NA
sigma        NA    1.05173   0.75117   1.47255   0.18060        NA        NA        NA
Q            NA    1.89772   0.90701   2.88843   0.50547        NA        NA        NA
x1     -0.04313   -0.04991  -0.23931   0.13950   0.09664   0.95132   0.78717   1.14970
x2      0.05731   -0.00514  -0.25736   0.24708   0.12869   0.99488   0.77309   1.28029

When I use coef, I get a different estimate for sigma (all others are the same)

coef(m)
          mu        sigma            Q           x1           x2 
 0.062326010  0.050440907  1.897718844 -0.049907137 -0.005137361 

sigma estimate is 0.05 here, was 1.05 above.

This may be a bug.

jrdnmdhl commented 7 years ago

I believe coef returns the transformed parameters. Gen. gamma uses a log transformation of sigma. Ln(1.05173) ~= 0.05044.

krz commented 7 years ago

thanks, that explains it, but I am unsure if this is intended.

chjackson commented 7 years ago

It's maybe half-intended, but having thought about it, I think this behaviour is OK. It's consistent with other kinds of regression models in R where coef() returns both the intercept and the regression coefficients on the transformed scale, e.g. in glm(), baseline log odds and log odds ratios in a logistic regression. In flexsurv, every parameter can have a regression model on it, even auxiliary parameters like sigma in the generalized gamma.

I'm not sure where would be the best place to document this - as it just uses stats:::coef.default and returns the $coefficients component of the model object - I've added a note to help(flexsurvreg), section "Value" to explain this component.

jrdnmdhl commented 7 years ago

One way to handle documenting it would be to write a coef generic (even if it simply calls coef.default) and document it from there.

chjackson commented 7 years ago

That sounds reasonable - done in latest commit.

Note R CMD check currently doesn't work as it's awaiting an update to the mstate package (which in turn was broken by a major update to the survival package)