Closed jgellar closed 7 years ago
Thanks for figuring this out, Jonathan, makes a lot of sense to me.
But what happens to the smooth object, which we still need to keep track of? I guess we still need to handle this by passing xt() an environment (and an index) where the smooth object goes?
That seems unavoidable as long as xt()
only takes an x
argument and nothing else.
If we do this, then within xt() we can just search the environment for smooth, and if it is already present, call PredictMat instead of smoothCon - similar to what we are doing now. In this case, I don’t think we even have to write a makepredictcall() method, right?
Seems so.
It's great that all of this stuff seems to work now, but I'm also starting to think it might have been a better design choice to do the entire formula parsing / model matrix setup in our own code and basically only use coxph.fit
instead of calling coxph
and handing over all these intricate closures to make sure the terms get evaluated correctly. That would have probably meshed well with using makepredictcall
as well. Too late now.... I really should have seen this coming, after having lots of similar issues when developing pffr
by wrapping around gam
instead of gam.fit
:frowning:
That seems unavoidable as long as xt() only takes an x argument and nothing else.
In the new plan, I think xt
would have to take other arguments (see the example above) - basically all the arguments that create.xt.func
takes, plus x
. btw, I'm not sold on xt()
being the best name for this function - I named it this for our "internal" tracking of the functions that pre-process the covariates (without involving t
), because it was analogous to what the tt
function did. But now with this function becoming part of the actual coxph
formula, we might want to rename it. Even though the user never has to touch this function.
I tend to agree with you regarding coxph.fit
vs. coxph
. Then again, part of the advantage of wrapping around coxph
is that coxph
supports a lot of stuff I don't want to have to worry about, e.g. strata/clustering, all sorts of different ways of calculating SE's and handling ties, etc. Plus in theory we should be able to use the coxph
methods if Therneau implements this tt
thing.
On that note - what do you think the next steps were if we were going to actively work with him? Should we try to write something to get the predictions working with tt
and "submit" it to him? He briefly mentioned authorship - which would be pretty cool. Maybe one of us should write to him and clarify future steps?
FYI, I'll be out on vacation all next week.
Sorry, yes, of course.
On that note - what do you think the next steps were if we were going to actively work with him? Should we try to write something to get the predictions working with tt and "submit" it to him? He briefly mentioned authorship - which would be pretty cool. Maybe one of us should write to him and clarify future steps?
That's a good idea. Let's discuss this further after we get back from vacation -- mine starts tomorrow, I'll be back September 2.
In the meantime, should I continue to push forward with what I was doing as if we never had the conversation with Therneau? The full "To Do" list is in #30, but I don't think all of this needs to be done for the initial release. Right now, the main things remaining that are necessary are to get something working to extract the baseline hazard function (survfit.pcox
), run some more checks on the predict/cross-validation code, and clean up some documentation.
Note that the first two of those things are difficult only because of these tt
/makepredictcall
issues. If we got this working within coxph
, then would probably be able to use predict.coxph
and survfit.coxph
directly. I just think it would take more time before Therneau is able to implement these changes.
I can't tell how hard it would be to write a makepredictcall.tt
method that works for us, or whether that's all that's missing, so I don't know what's the best strategy here.
Aren't there other strategies as well, though:
Isn't it true though that once we have a working predict
-method, we could use that for generating the quantities that survfit
needs?
Or, on a lower level, if we had a working model.frame.pcox
(or rather model.matrix.pcox
...) method, that that would make both predict.coxph
and survfit.coxph
work for us?
Either would of course have to implement the functionality of a makepredictcall.tt
, but we can tailor it more to our own implementation this way since we can do the computation based on the pcox
-object instead of inside a call to a coxph
-method.
I just think it would take more time before Therneau is able to implement these changes.
I agree.
I can't tell how hard it would be to write a makepredictcall.tt method that works for us, or whether that's all that's missing, so I don't know what's the best strategy here.
Right now the tt
function is executed after model.frame
is called - meaning that the result of the tt
call (i.e., an object that we can say is class "tt"
), won't ever get to makepredictcall.tt
as currently constructed. There would need to be changes to both coxph
and predict.coxph
to make this work. Therneau clearly had interest in doing this though.
Isn't it true though that once we have a working predict-method, we could use that for generating the quantities that survfit needs?
We have a working predict
method, and yes I'm using it in survfit.pcox
. The actual survival calculations are handled by survfitcoxph.fit
- it's just a bit annoying to set up the input to this function, but I'm working through it.
Or, on a lower level, if we had a working model.frame.pcox (or rather model.matrix.pcox...) method, that that would make both predict.coxph and survfit.coxph work for us?
If/when we go to the model.frame
system, I think we should just do it for coxph
. That way we don't have to figure the same thing out twice. I'm pretty sure once coxph
has this implemented, we can just use it. Also, doesn't this only work if we are calling model.frame
within pcox
? Since it's being called within coxph
, I'm not sure if we'll have all the control we need.
We have a working predict method, and yes I'm using it in survfit.pcox.
Sorry, I got confused with that
if (missing(newdata)) {
newdata <- something
}
placeholder in survfit.pcox
that gives errors. Of course we have a working predict
, I'm stupid.
Also, doesn't this only work if we are calling model.frame within pcox?
I don't think so. As long as the object that the model.frame
-generic is called on has "pcox"
before "coxph"
in its class-attribute, method dispatch will first try to find&use the model.frame.pcox
method, regardless from where the generic is called.
I'm stupid.
You said it, not me.
I don't think so. As long as the object that the model.frame-generic is called on has "pcox" before "coxph"in its class-attribute, method dispatch will first try to find the model.frame.pcox method, regardless from where the generic is called.
I think I understand - you are refering to when model.frame
is called from within predict.coxph
(or predict.pcox
) - which doesn't have to be the same model.frame
method that is called from within the coxph
call. I guess that could work. I'll admit I am a bit new to the exact use of model.frame
/model.matrix
, and this may end up being more difficult than the plan of modifying survival
functions directly.
this may end up being more difficult than the plan of modifying survival functions directly.
not sure. i'm off, gotta pack, enjoy your vacation!
Jonathan, is this issue still current? Do we need this for the CRAN release?
No, I don't think we need this for the CRAN release. In particular because IIRC we would have to make some recommendations to Therneau so he can make his changes to the survival package, and we have no control over how long that would take.
So I'm closing this because it doesn't seem like something we'll work on for the foreseeable future.....
I’m trying to figure out how we could make
makepredictcall()
work forpcox
. Here is what I’ve learned about how it works:makepredictcall()
is called withinmodel.frame.default
. The first timemodel.frame.default
is called (i.e., when the model is fit), thepredvars
attribute of the formula will beNULL
. In this case, the columns of the model frame (i.e., the “variables”) are created by evaluating the “terms” calls.makepredictcall()
is called on those variables. The result is a new call, which is stored as thepredvars
attribute of the formula.predict()
is called on the model, model.frame.default is called again. This timepredvars
is not null. So instead of evaluating the “terms” call, it evaluates thepredvars
call.Now for
pcox
…My initial assessment is that this will be a lot of work, and isn’t worth doing now because (a) we are pretty close to release as is, and (b) Therneau would need to implement some changes to the
tt
system incoxph
before this could even work. Assuming he follows through and does that though, it might be worth it in the end for us to do this because it will make all the methods magically work, perhaps more reliably/robustly to changes in thecoxph
code.Let’s first talk about the non time-varying terms, e.g. f(age), because these are the only ones we can even consider until (b) above is taken care of. What we want to happen is, the first time
model.frame.default
is called, the “terms” call has to return acoxph.penalty
object. We also need to setpredvars
- this should be a call that ends up callingPredictMat()
on the smooth object. However,PredictMat
needs adata
argument when it is executed, andmodel.frame.default
won’t know to send the data in asdata
. So we need to create a wrapper function that sets up the data, and callsPredictMat
with that argument.predvars
will be a call to this function. Luckily, we have a system to do this already - these are ourxt
functions.Right now, all of the processing of these terms occurs within
pcox
, beforecoxph
is ever called. We convert the input data into acoxph.penalty
object, and this is whatcoxph
sees. So this won’t work withmakepredictcall
- thesmooth
object never even makes it intocoxph
, sopredvars
wouldn’t be able to be set up right.What we would need to do is have all the processing occur within
coxph
. I think this means that thecoxph
call that we create inpcox
will have to have terms in it that are function calls essentially do what our currentxt
functions do. e.g., if we call this functionxt()
, then thecoxph
call is something likeCurrently, we create the
xt
functions dynamically, so we don’t actually have a functionxt()
in the package - we usecreate.xt.func()
. This was done so the only argument of thext
function we create isx
, so that it mimics what we do fortt
functions (which are restricted to only have argumentsx
andt
). This would have to change for the new implementation. Instead of creating thext
function dynamically, inpcox
we set up a call to our newxt()
, which does exactly what the dynamically-createdxt
functions did before.So the first time
model.frame.default
is called, thatxt
term will be executed and return acoxph.penalty
term. That term will have to be of classxt
(inheriting fromcoxph.penalty
), so we can write amakepredictcall.xt
method. But what happens to the smooth object, which we still need to keep track of? I guess we still need to handle this by passingxt()
an environment (and an index) where thesmooth
object goes? If we do this, then withinxt()
we can just search the environment forsmooth
, and if it is already present, callPredictMat
instead ofsmoothCon
- similar to what we are doing now. In this case, I don’t think we even have to write amakepredictcall()
method, right?Let me know what you think (especially about this last part re: the smooth object).