dflemin3 / approxposterior

A Python package for approximate Bayesian inference and optimization using Gaussian processes
https://dflemin3.github.io/approxposterior/
MIT License
41 stars 9 forks

Use other regression algorithms besides the GP for log-probability predictions #39

Closed — dflemin3 closed this issue 4 years ago

dflemin3 commented 5 years ago

Currently, as per the BAPE algorithm of Kandasamy+2015, I use the GP in a regression framework to predict log-probabilities for use with likelihood-based inference, e.g. MCMC. In principle, there is no reason not to use a different regression algorithm, e.g. a random forest, to make the log-probability predictions given the training set {theta, y}. In this framework, the GP would still be used to select points by maximizing the utility function, thereby only building the training set in high-likelihood regions of parameter space, but the GP would not be explicitly used for posterior estimation.
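As a rough sketch of the idea (not approxposterior's actual internals), any sklearn-style regressor could be fit on the accumulated training set {theta, y} of parameters and log-probabilities and then queried in place of the GP; the target function and all names below are illustrative stand-ins:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

# Mock training set: theta are 2-D parameter vectors, y are the
# corresponding log-probabilities (here a simple Gaussian target as a
# stand-in for expensive forward-model evaluations).
theta = rng.uniform(-3, 3, size=(200, 2))
y = -0.5 * np.sum(theta**2, axis=1)  # log N(0, I), up to a constant

# Any regressor with fit/predict could slot in here instead of the GP.
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(theta, y)

# Predict log-probabilities at new points, e.g. inside an MCMC loglike call.
theta_new = np.array([[0.0, 0.0], [2.0, 2.0]])
logp_pred = rf.predict(theta_new)
```

The GP would still drive point selection via the utility function; only the final posterior-estimation step would swap in the alternative regressor.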

In general, this would require the user to specify an algorithm, a way to train it, and a way to optimize its hyperparameters. Sklearn pipelines could be excellent for this task; however, the question becomes how much work is placed on the user to implement this machinery for their runs. I imagine a meta-model class, with train, predict, etc. methods, that can ingest either the GP or an sklearn-friendly object. Building on the sklearn estimator base class could make this tractable, especially for sklearn estimators, and would mostly require me writing an sklearn-like class that plays nicely with the george Gaussian process. The fit method, for instance, could call approxposterior's hyperparameter optimization routines, while the predict method could call george's predict method to estimate the mean and variance of the conditional distribution.
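A minimal sketch of that meta-model idea, assuming a simple duck-typed wrapper (the class name `SurrogateModel` and its interface are hypothetical, not actual approxposterior API):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge


class SurrogateModel:
    """Hypothetical wrapper giving a uniform fit/predict interface.

    For a george GP, fit() would instead invoke approxposterior's
    hyperparameter optimization routines, and predict() would call
    gp.predict(...) to return the conditional mean (and variance).
    """

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y):
        # Delegate training to the wrapped estimator.
        self.estimator.fit(X, y)
        return self

    def predict(self, X):
        # Delegate prediction; a GP-backed version could also return variances.
        return self.estimator.predict(X)


# Either regressor slots in behind the same interface.
X = np.linspace(-2, 2, 50).reshape(-1, 1)
y = -0.5 * X.ravel() ** 2

for est in (RandomForestRegressor(n_estimators=50, random_state=0), Ridge()):
    model = SurrogateModel(est).fit(X, y)
    pred = model.predict(np.array([[0.0]]))
```

The design choice here is duck typing over inheritance: anything exposing fit/predict works, so the george GP only needs a thin adapter class rather than a full sklearn estimator implementation.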

dflemin3 commented 4 years ago

Implemented and in the testing phase for the 0.3 release.