SheffieldML / GPy

Gaussian processes framework in python
BSD 3-Clause "New" or "Revised" License

[feature] Maintaining inverse covariance matrix in state and applying updates #812

Open · ekalosak opened 4 years ago

ekalosak commented 4 years ago

Following basic matrix manipulations (read: nothing fancy, see e.g. Algorithm 1 in this source), the inverse of the kernel matrix could be maintained in state and updated incrementally as new observations arrive. This would drastically reduce the cost of prediction: incorporating a new point becomes an O(n^2) update instead of an O(n^3) re-inversion.
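
For concreteness, here is a minimal sketch of the update I have in mind, assuming we keep K⁻¹ in state and grow it by one point via the block matrix inversion (Schur complement) identity. The function name and signature are illustrative, not existing GPy API:

```python
import numpy as np

def extend_inverse(K_inv, k_new, k_ss):
    """Extend a stored kernel matrix inverse with one new observation.

    K_inv : (n, n) inverse of the current kernel matrix K
    k_new : (n,) cross-covariances k(X, x_new)
    k_ss  : float, prior variance k(x_new, x_new)

    Returns the inverse of [[K, k_new], [k_new.T, k_ss]] in O(n^2)
    via the block matrix inversion identity.
    """
    v = K_inv @ k_new                    # K^{-1} k_new, O(n^2)
    s = k_ss - k_new @ v                 # scalar Schur complement
    if s <= 0:
        raise np.linalg.LinAlgError("extended matrix is not positive definite")
    top_left = K_inv + np.outer(v, v) / s
    top_right = -v[:, None] / s
    return np.block([[top_left, top_right],
                     [top_right.T, np.array([[1.0 / s]])]])
```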

I'm happy to tackle this myself, or if someone else takes it first, great - I'm writing this issue to ask for input on the best place to prototype this feature: should it go in GPy/core/gp.py ? Or somewhere else?

lawrennd commented 4 years ago

Hi Eric,

This would be great. Are you thinking of the online case? Be careful, though: the Woodbury-based inverse formula is not great numerically ... this comes up a lot in EP, and it also comes up in Kalman filter implementations. The more stable update is done through a Cholesky update, which is a bit more involved.

Matthias Seeger's thesis looks at this in the context of EP (where Cholesky updates and downdates are used ... but downdates are also numerically troublesome!) https://era.ed.ac.uk/bitstream/handle/1842/321/IP030012.pdf?sequence=1&isAllowed=y
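
For reference, a rough sketch of the Cholesky-based extension I mean: one triangular solve per new point, and K⁻¹ is never formed explicitly, which is what makes it better behaved numerically. The names are illustrative, not GPy internals:

```python
import numpy as np
from scipy.linalg import solve_triangular

def extend_cholesky(L, k_new, k_ss):
    """Extend the lower Cholesky factor L of K to cover one new point.

    Returns the lower Cholesky factor of [[K, k_new], [k_new.T, k_ss]],
    computed in O(n^2) via a single forward substitution.
    """
    l = solve_triangular(L, k_new, lower=True)  # solve L l = k_new
    d2 = k_ss - l @ l                           # squared new diagonal entry
    if d2 <= 0:
        raise np.linalg.LinAlgError("extended matrix is not positive definite")
    n = L.shape[0]
    L_new = np.zeros((n + 1, n + 1))
    L_new[:n, :n] = L
    L_new[n, :n] = l
    L_new[n, n] = np.sqrt(d2)
    return L_new
```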

ekalosak commented 4 years ago

It seems like downdates would be an additional feature - clearly useful but not a blocker for online updates - is that right? And, as long as the numerical instability is disclosed (and not abysmal), the Woodbury update seems like a good starting point - albeit a flawed one. I'll have to dig into Dr Seeger's thesis...

zhenwendai commented 4 years ago

> Following basic matrix manipulations (read: nothing fancy, see e.g. Algorithm 1 in this source), the inverse of the kernel matrix could be maintained in state and updated incrementally as new observations arrive. This would drastically reduce the cost of prediction: incorporating a new point becomes an O(n^2) update instead of an O(n^3) re-inversion.

What is the motivation for doing an incremental matrix inverse update?

For GPs, we often update the kernel hyperparameters, which forces a complete recomputation of the covariance matrix; an incremental inverse update is therefore not applicable in that setting. I know such tricks are used in Bayesian linear models for online learning, where the basis functions do not change.
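
To illustrate why that setting is different, here is a minimal sketch of the online update for a Bayesian linear model with fixed basis functions (illustrative code, not GPy API):

```python
import numpy as np

class OnlineBayesianLinearModel:
    """Online Bayesian linear regression with fixed basis functions.

    Because the features never change, the posterior precision A and
    the vector b can be updated per observation in O(m^2) for m basis
    functions; no full recomputation is ever needed.
    """

    def __init__(self, n_features, alpha=1.0, noise_var=1.0):
        self.A = alpha * np.eye(n_features)  # prior precision
        self.b = np.zeros(n_features)
        self.noise_var = noise_var

    def update(self, phi, y):
        """Fold in one observation: feature vector phi, scalar target y."""
        self.A += np.outer(phi, phi) / self.noise_var
        self.b += phi * y / self.noise_var

    def posterior_mean(self):
        return np.linalg.solve(self.A, self.b)
```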