Jackknife+ Implementation

matteo-fontana commented 2 years ago

It would be useful to implement, as a prediction method, the Jackknife+ by Barber and coauthors (https://arxiv.org/abs/1905.02928).

paolo-vergo commented 2 years ago

@ryantibs

To implement jackknife+, we added a new function called conformal.pred.jackplus and one more return argument to lm.funs, called update.fun.

Since the algorithm is quite computationally expensive, we implemented a series of workarounds:

- We parallelised the computation of the models, the leave-one-out (LOO) residuals and the bounds with functions from the future.apply library.
- When the regression method is given by lm.funs, we exploited the Sherman-Morrison formula to quickly update the models (removing one observation each time) through the above-mentioned update.fun argument.
- We used the special.fun argument, if present, to speed up the computation of the LOO residuals.
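
As an illustration of the Sherman-Morrison workaround described above, here is a minimal sketch for ordinary least squares. The helper name loo.coef.sm is hypothetical and this is not the update.fun from the branch; it only shows how all n leave-one-out coefficient vectors can be obtained from one factorisation, using beta_{-i} = beta - (X'X)^{-1} x_i e_i / (1 - h_ii).

```r
# Hypothetical helper (not part of the package): leave-one-out OLS
# coefficients via the Sherman-Morrison identity, without refitting.
loo.coef.sm = function(x, y) {
  X = cbind(1, x)                     # add an intercept column
  A.inv = solve(crossprod(X))         # (X'X)^{-1}, computed once
  beta = A.inv %*% crossprod(X, y)    # full-data coefficients
  e = as.numeric(y - X %*% beta)      # full-data residuals
  G = X %*% A.inv                     # row i is x_i' (X'X)^{-1}
  h = rowSums(G * X)                  # leverages h_ii
  # Column i of the result is beta_{-i}
  matrix(beta, nrow(beta), length(y)) - t(G * (e / (1 - h)))
}

# Small check against an explicit refit without observation 3
set.seed(1)
x = matrix(rnorm(40), 20, 2)
y = rnorm(20)
B = loo.coef.sm(x, y)
b3 = coef(lm(y[-3] ~ x[-3, ]))
```

Column 3 of B should agree with b3 up to numerical error. Predictions at x0 for every leave-one-out model then follow from cbind(1, x0) %*% B.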

The function takes as input:

- x: Matrix of features, of dimension (say) n x p.
- y: Vector of responses, of length (say) n.
- x0: Matrix of features, each row being a point at which we want to form a prediction interval, of dimension (say) n0 x p.
- train.fun: A function to perform model training, as in the split conformal function.
- predict.fun: A function to perform prediction for the (mean of the) responses at new feature values, as in the split conformal function.
- alpha: Miscoverage level for the prediction intervals.
- special.fun: A function to efficiently compute the leave-one-out residuals. Not every regression method provides one.
- update.fun: A function to efficiently compute the models obtained by removing one observation from the training dataset. Default is NULL.

The function returns a list with the following components: pred, lo, up, split. They have the same structure as the return of conformal.pred.split.
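
For readers unfamiliar with the method, a naive version of the jackknife+ interval can be sketched as follows. This is only an illustration under simplifying assumptions, not the code on the branch: jackplus.sketch and the two lm.fit-based helpers are hypothetical, the models are refitted n times with no shortcuts, and only lo and up are returned.

```r
# Hypothetical sketch (not the package code): jackknife+ with generic
# train.fun(x, y) and predict.fun(fit, x0), refitting n times.
jackplus.sketch = function(x, y, x0, train.fun, predict.fun, alpha = 0.1) {
  n = nrow(x); n0 = nrow(x0)
  lo.mat = up.mat = matrix(0, n, n0)
  for (i in 1:n) {
    fit = train.fun(x[-i, , drop = FALSE], y[-i])
    # Leave-one-out residual at the held-out point
    r = as.numeric(abs(y[i] - predict.fun(fit, x[i, , drop = FALSE])))
    # Leave-one-out predictions at the test points
    mu0 = as.numeric(predict.fun(fit, x0))
    lo.mat[i, ] = mu0 - r
    up.mat[i, ] = mu0 + r
  }
  # Order statistics used by jackknife+ (infinite bounds if out of range)
  k.lo = floor(alpha * (n + 1))
  k.up = ceiling((1 - alpha) * (n + 1))
  lo = if (k.lo < 1) rep(-Inf, n0) else apply(lo.mat, 2, function(v) sort(v)[k.lo])
  up = if (k.up > n) rep(Inf, n0) else apply(up.mat, 2, function(v) sort(v)[k.up])
  list(lo = lo, up = up)
}

# Assumed least-squares helpers, for illustration only
train.fun = function(x, y) lm.fit(cbind(1, x), y)
predict.fun = function(fit, x0) cbind(1, x0) %*% fit$coefficients

set.seed(1)
n = 50
x = matrix(rnorm(n * 2), n, 2)
y = x[, 1] + rnorm(n)
x0 = matrix(rnorm(6), 3, 2)
out = jackplus.sketch(x, y, x0, train.fun, predict.fun, alpha = 0.1)
```

Unlike the plain jackknife, every test point needs the prediction of each leave-one-out model, which is why the n fitted models (or a fast update of them) are required and not just the residuals.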

We also built an example script, called ex.conformal.pred.jackplus, to test the code, similar to the existing ex.conformal.split. Moreover, we would like to point out that our implementation includes Roxygen headers, allowing for faster rebuilding of the documentation and providing consistency with the existing package.

ryantibs commented 2 years ago

@paolo-vergo Sorry for the long delay here. Thanks for writing this out. Several questions/comments.

Wouldn't it make sense just to update the existing conformal.pred.jack() function rather than write a new one? It looks like the argument structure should be exactly the same and a lot of the code should be the same as well, right?

So you could simply add a plus argument to conformal.pred.jack() to signify whether we want the usual jackknife or the jackknife+. We could make the default plus = TRUE, so that the default is jackknife+.
The parallelization you're describing could also be applied to the jackknife (not plus) option as well. I would be sure to make this fail gracefully if they don't have the requisite library---to be clear, if that library is not installed, then it just does the naive serial calculation.
Lastly, I'm confused why an update.fun return argument needs to be added to lm.funs(). What you're describing sounds like it's equivalent to what is returned in the special.fun return argument of lm.funs(). The latter already does the leave-one-out trick, right? Is there some reason that something more is needed here?
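
For context, the leave-one-out trick for least squares referred to above can be sketched as below. This is a standalone illustration and not the special.fun returned by lm.funs(); the helper name loo.resid.lm is hypothetical. It returns the absolute leave-one-out residuals |e_i| / (1 - h_ii) from a single fit.

```r
# Hypothetical helper: absolute leave-one-out residuals for OLS
# from one fit, using the leverages h_ii.
loo.resid.lm = function(x, y) {
  X = cbind(1, x)                  # add an intercept column
  fit = lm.fit(X, y)
  h = rowSums(qr.Q(fit$qr)^2)      # leverages from the thin Q factor
  unname(abs(fit$residuals / (1 - h)))
}

# Check against an explicit refit without observation 5
set.seed(2)
x = matrix(rnorm(60), 30, 2)
y = rnorm(30)
r = loo.resid.lm(x, y)
f = lm.fit(cbind(1, x[-5, ]), y[-5])
r5 = abs(y[5] - sum(c(1, x[5, ]) * f$coefficients))
```

Here r[5] and r5 should match up to numerical error. Note that this shortcut yields residuals only, not the leave-one-out predictions at new points.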

paolo-vergo commented 2 years ago

@ryantibs

OK, we have modified conformal.pred.jack() to integrate jackknife+. Our only doubt is how to handle the mad functions: in particular, should we multiply the final quantiles by the mad predictions at x0? I hope that's clear!
Following your advice, we implemented two functions in the file common.R, called one.sapply and one.lapply, which run in parallel if the future.apply library is installed and otherwise proceed serially.
In jackknife+ we need the estimated value at the test point x0 for each leave-one-out model (the model obtained by removing one observation at a time), so update.fun is meant to directly update the estimated regression coefficients with the Sherman-Morrison formula, without explicitly re-estimating the models. The performance improvement, however, is quite negligible, so at the end of the day there might be no need for it.
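
The parallel-or-serial behaviour described above can be sketched roughly as follows. The function name one.sapply.sketch is hypothetical, and the actual one.sapply in common.R may be written differently; the sketch only assumes that future.apply::future_sapply is used when the package is available.

```r
# Hypothetical re-creation of the idea: use future.apply when it is
# installed, otherwise fall back to the base serial sapply.
one.sapply.sketch = function(X, FUN, ...) {
  if (requireNamespace("future.apply", quietly = TRUE)) {
    future.apply::future_sapply(X, FUN, ...)
  } else {
    sapply(X, FUN, ...)
  }
}

sq = one.sapply.sketch(1:5, function(i) i^2)
```

Even with future.apply installed, the work only runs in parallel once the user selects a parallel plan, for example with future::plan(future::multisession); under the default sequential plan the result is the same as with sapply.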

ryantibs commented 2 years ago

Hi @paolo-vergo, me again, apologizing for such a long delay here. It's been really tough to find time to check in and think about this. I'm really sorry to keep you guys waiting so long.

While we figure out what to do vis-a-vis jackknife+ with local weighting, why don't you go ahead and submit a PR, or at least a draft PR, with this just to get the ball rolling.

I've sent out a message to try to see if I can get some help from collaborators / students / people in the conformal community in terms of maintaining this package so hopefully if we can find said people, they will be able to jump in, discuss issues, tend to pull requests, etc., and the whole thing won't be moving so slowly. Thank you for your patience and sorry again ...

paolo-vergo commented 2 years ago

Hi @ryantibs, as requested I have sent the PR for the Jackknife+ (without local weighting)!

ryantibs commented 2 years ago

Hi @paolo-vergo, thinking about it and discussing it with my friend Rina (lead author on the jack+ paper), this should be the right way to do locally-weighted conformal with jackknife+:

lower and upper quantiles of

\hat\mu_{-i}(X_{n+1}) \pm \hat\sigma_{-i}(X_{n+1}) R_i

where R_i is the studentized leave-one-out residual

R_i = |Y_i - \hat\mu_{-i}(X_i)| / \hat\sigma_{-i}(X_i)

Make sense?
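
A rough sketch of this locally-weighted variant is given below, under several assumptions that are not taken from the thread: both the mean and the spread model are linear least-squares fits, the spread model is fitted to the absolute in-sample residuals of the leave-one-out mean model, its predictions are floored at a small positive value, and the studentized residual is evaluated at the held-out point X_i. The name jackplus.local.sketch is hypothetical.

```r
# Hypothetical sketch of locally-weighted jackknife+ (not the branch code).
jackplus.local.sketch = function(x, y, x0, alpha = 0.1) {
  n = nrow(x); n0 = nrow(x0)
  fit.lin = function(x, y) lm.fit(cbind(1, x), y)$coefficients
  pred.lin = function(b, x) as.numeric(cbind(1, x) %*% b)
  lo.mat = up.mat = matrix(0, n, n0)
  for (i in 1:n) {
    xi = x[-i, , drop = FALSE]; yi = y[-i]
    b.mu = fit.lin(xi, yi)                              # mean model without i
    b.sig = fit.lin(xi, abs(yi - pred.lin(b.mu, xi)))   # spread model without i
    sig = function(z) pmax(pred.lin(b.sig, z), 1e-8)    # keep the scale positive
    xh = x[i, , drop = FALSE]
    R = abs(y[i] - pred.lin(b.mu, xh)) / sig(xh)        # studentized LOO residual
    mu0 = pred.lin(b.mu, x0); s0 = sig(x0)
    lo.mat[i, ] = mu0 - s0 * R
    up.mat[i, ] = mu0 + s0 * R
  }
  # Same order statistics as in the unweighted jackknife+
  k.lo = floor(alpha * (n + 1))
  k.up = ceiling((1 - alpha) * (n + 1))
  lo = if (k.lo < 1) rep(-Inf, n0) else apply(lo.mat, 2, function(v) sort(v)[k.lo])
  up = if (k.up > n) rep(Inf, n0) else apply(up.mat, 2, function(v) sort(v)[k.up])
  list(lo = lo, up = up)
}

set.seed(3)
n = 60
x = matrix(rnorm(n * 2), n, 2)
y = x[, 1] + (1 + abs(x[, 2])) * rnorm(n)   # noise level depends on x
x0 = matrix(rnorm(8), 4, 2)
out.local = jackplus.local.sketch(x, y, x0, alpha = 0.1)
```

The resulting intervals are wider where the fitted spread at x0 is larger, which is the point of the local weighting.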

paolo-vergo commented 2 years ago

Hi @ryantibs! It makes sense to me, so I have implemented the local weighting and committed the changes to my branch.

ryantibs / conformal

Jackknife+ Implementation #10