pcox return object

jgellar commented 9 years ago

Hi Fabian,

In that "notes" file that you sent with your initial impressions of pcox, you wrote that one of the pffr "mistakes" to avoid would be:

stronger modification of the return object instead of simply adding more stuff to the object returned by mgcv: easier to write methods, more memory efficient (pffr objects are often huge because of lots of duplication)

Could you be more specific about what you mean by this? For our purposes, we would be modifying/adding to the object returned by coxph, but it's the same idea. My original plan was to take the object returned from coxph and add a "pcox" element to it, which contains all of the "extra" stuff from pcox. This includes:

the pcox formula
the trmmap and labelmap variables
the "where" variables
the list of smooth objects (needed for methods)
etc.

Are you saying you don't think this is the best plan, and if not, what do you propose instead?
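
For concreteness, the plan described above might look roughly like the following. This is only a sketch: an ordinary coxph fit on the survival package's lung data stands in for the fit pcox would produce internally, and the field names and empty placeholder contents of the "pcox" element are assumptions for illustration, not the actual pcox implementation.

```r
library(survival)

# Stand-in for the coxph fit that pcox would produce internally.
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Hypothetical "pcox" element holding the extra information.
# All field names and contents below are placeholders.
fit$pcox <- list(
  formula  = Surv(time, status) ~ age + sex,  # the pcox formula
  trmmap   = list(),                          # term map
  labelmap = list(),                          # label map
  where    = list(),                          # the "where" variables
  smooth   = list()                           # smooth objects needed by methods
)

# The object is still a coxph fit, now carrying one extra element.
names(fit$pcox)
```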

fabian-s commented 9 years ago

AFAICS, in designing this, the trade-off is between robustness against coxph-changes & smaller initial design effort versus elegance/maintainability & parsimony.

You'll get some robustness against internal coxph-changes and avoid a lot of design work if you just dump all the additional info for the pcox-model into the coxph-object and let the resulting conglomeration inherit from pcox, coxph.penal, and coxph, similar to what I did for pffr. That means some of the methods defined in survival will still work, and you overwrite those that don't work (survfit, e.g.) or don't exist (plot, e.g.) by pcox-methods. It also means that

pcox-methods have to work with/around/against the shape of the coxph-object that was not really designed for this type of model, potentially leading to convoluted/messy code (take a look at predict.pffr to see what I mean)
and/or some of the info in the return object will be redundant as the new methods rely on lots of additional information in the pcox-appendix to the coxph-object.
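
A rough sketch of this first option, assuming a penalized coxph fit (here with a pspline term on the lung data) as a stand-in for what pcox fits. The plot.pcox method is hypothetical and only signals that dispatch reached it.

```r
library(survival)

# A penalized fit, so the object inherits from coxph.penal and coxph.
fit <- coxph(Surv(time, status) ~ pspline(age, df = 4) + sex, data = lung)

# Put "pcox" in front so that pcox methods are tried first.
class(fit) <- c("pcox", class(fit))

# Hypothetical pcox method for a generic that has no coxph method.
plot.pcox <- function(x, ...) {
  message("plot.pcox called")
  invisible(x)
}

plot(fit)        # dispatches to plot.pcox
b <- coef(fit)   # no pcox method defined, so this falls through to inherited behaviour
```

Methods that are not overridden keep working only as long as they do not trip over the extra pcox content, which is the robustness-versus-messiness trade-off described above.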

The other option is to define your own return object from scratch. That means you'd have to think hard about what you want to be able to do with that model once it's fit and how to achieve that. It means you can avoid duplicating data or design matrices in the pcox return object. It also means you can shape the return object so that it is easier to write clear & concise plot/predict/summary etc methods for it.
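
By contrast, a from-scratch return object might be sketched as below. The constructor new_pcox, its fields, and the print method are all hypothetical; the point is only that the object keeps what the methods need and does not inherit from coxph.

```r
library(survival)

# Hypothetical constructor: store only what pcox methods would need.
new_pcox <- function(coefficients, var, formula, smooth = list()) {
  structure(
    list(
      coefficients = coefficients,  # estimated coefficients
      var          = var,           # their covariance matrix
      formula      = formula,       # model formula
      smooth       = smooth         # smooth objects (placeholder)
    ),
    class = "pcox"
  )
}

# With no inheritance from coxph, every method has to be written explicitly.
print.pcox <- function(x, ...) {
  cat("pcox fit with", length(x$coefficients), "coefficients\n")
  invisible(x)
}

# Fit with coxph internally, then keep only the selected pieces.
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)
obj <- new_pcox(coef(fit), fit$var, formula(fit))
print(obj)
```

None of the methods from survival apply to obj here, which is the cost of this option; the benefit is full control over what is stored and how methods access it.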

This is more attractive for coxph than it was for the gam-object underlying pffr, I think, as there are far fewer methods available for coxph-objects than there are for gam-objects, so you lose less by doing your own thing.

To sum up, I'm not proposing to go for one or the other, I'm saying that this is an important design choice and how I see it. The sweet spot is probably somewhere in the middle -- don't do everything from scratch, but also don't just dump lots of additional info into the model returned by coxph.

fabian-s commented 7 years ago

I'm closing this since I think it's not realistic to switch strategies re. the structure of the return object now with all the methods you've already written for it...

jgellar / pcox

pcox return object #3