Open markdanese opened 11 months ago
Thanks for the report. The default `newdata` that the flexsurvreg predict method uses is the "model frame" that is created in this line of flexsurvreg.R. When run with an `ns()` formula, this line seems to put the basis variables into the model frame, rather than the original covariate values that we want. I haven't used `ns` and the like, so I can't see a quick fix. I will leave this open.
This is proving tricky to handle. The function `stats::get_all_vars` seems like it would be useful here, as it is designed to extract the original variables supplied to a formula, whereas `stats::model.frame` extracts the transformed versions. However, `get_all_vars` fails in cases where the formula contains a data frame look-up, e.g. compare:
get_all_vars(ovarian$futime ~ 1, data=NULL) # fails
model.frame(ovarian$futime ~ 1, data=NULL) # works
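For anyone following along, here is a minimal sketch of the difference between the two functions on a spline formula. The `mtcars` data and the `mpg ~ ns(wt, 2)` formula are arbitrary stand-ins chosen so the snippet runs with base R only; they are not the data or model from this issue.

```r
library(splines)  # provides ns(); ships with a standard R installation

# Illustrative stand-ins only: mtcars and this formula are not from the issue
f <- mpg ~ ns(wt, 2)

mf <- model.frame(f, data = mtcars)   # transformed versions of the variables
av <- get_all_vars(f, data = mtcars)  # original variables named in the formula

names(mf)           # "mpg" "ns(wt, 2)"
names(av)           # "mpg" "wt"
is.matrix(mf[[2]])  # TRUE: the spline basis is stored as a 2-column matrix
```

The model frame has no `wt` column at all, which seems consistent with the "could not find the variable" error reported for `age`.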
I really appreciate this package. It makes things much easier, particularly with regard to generating causal contrasts and getting reasonable variance estimates.
I ran into an issue trying to get `standsurv()` to work when using a natural spline from the `splines` package. In this case it was age as a predictor in a model of time to death (in lung cancer). When age was used as a simple continuous variable, `standsurv()` worked fine without needing to specify the data set. When I changed to a natural spline (`ns(age, 2)`) to handle some non-linearity in increasing risk with age, I got the error that it could not find the variable `age`. Helpfully, the error message suggested I should specify "newdata".

I noted that the model object includes the transformed age (i.e., in this case with 2 spline terms), so the error makes sense -- age is not there. And when I specified newdata = the original dataset, it worked without an error.
I am guessing that the predict function (which I think is part of `summary()`) isn't built for this use case. I tried to see how to work around this and suggest a code change, but I couldn't find anything helpful.

The simple workaround is to explicitly specify the original dataset, so it is not a critical issue. However, I wanted to put this out there in case anyone runs into this.
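A rough sketch of the workaround, in case it helps. It assumes the `lung` data from the survival package, a Weibull distribution, and two arbitrary time points; these are illustrative choices, not the exact model from this report.

```r
library(flexsurv)  # depends on survival, which provides Surv() and the lung data
library(splines)   # provides ns()

# Illustrative model only: Weibull fit with a 2-df natural spline for age
fit <- flexsurvreg(Surv(time, status) ~ ns(age, 2),
                   data = lung, dist = "weibull")

# Workaround: pass the original data explicitly so that `age` can be found
# when the spline basis is rebuilt for prediction
res <- standsurv(fit, newdata = lung, t = c(365, 730))
res
```

Leaving out `newdata` in the `standsurv()` call is what triggered the error for me.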