SheffieldML / GPy

Gaussian processes framework in Python
BSD 3-Clause "New" or "Revised" License

Calculating the Bayesian Information Criterion. #298

Closed · elkeschaper closed this issue 1 year ago

elkeschaper commented 8 years ago

I'm interested in calculating the Bayesian Information Criterion for model selection, i.e. to assess the fit of different kernels.

For a model of type GPy.models.gp_regression.GPRegression, my current approach is as follows:

import numpy as np

def calculate_BIC(model):
    """
    Calculate the Bayesian Information Criterion (BIC) for a GPy `model` with maximum likelihood hyperparameters on a given dataset.
    https://en.wikipedia.org/wiki/Bayesian_information_criterion
    """
    # model.log_likelihood() is the natural logarithm of the marginal likelihood of the Gaussian process.
    # len(model.X) is the number of data points.
    # model._size_transformed() is the number of optimisation parameters.
    return -2 * model.log_likelihood() + len(model.X) * np.log(model._size_transformed())
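
A usage sketch, assuming toy one-dimensional data and GPy's standard fitting workflow (GPy.models.GPRegression plus optimize()):

import numpy as np
import GPy

# Toy data: noisy observations of a sine function.
X = np.linspace(0, 10, 50)[:, None]
Y = np.sin(X) + 0.1 * np.random.randn(50, 1)

model = GPy.models.GPRegression(X, Y, kernel=GPy.kern.RBF(input_dim=1))
model.optimize()  # maximum likelihood fit of the hyperparameters

print(calculate_BIC(model))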

My questions now are:

  • Is this already implemented, but I missed it?
  • Would you agree with this approach? Or are there better ways for generalization?

Cheers!

Elke

lawrennd commented 8 years ago

Hi Elke,

Just from a theoretical perspective: although people do sometimes compute BIC for these models (often under pressure from reviewers), I think it doesn't make sense to. I believe implicit in BIC (and AIC) is the idea that the parameters you count should make the data conditionally independent. In other words, your model should make an i.i.d. assumption about the stochasticity given the parameters. GPs don't do this; they assume correlation.
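
Concretely, the GP marginal likelihood couples every observation through the full covariance matrix, so the data are not conditionally independent given the hyperparameters. A minimal standalone sketch of the quantity involved (a hypothetical helper, not GPy's own implementation):

import numpy as np

def gp_log_marginal_likelihood(y, K, noise_var):
    # log N(y | 0, K + noise_var * I) for a zero-mean GP: the off-diagonal
    # entries of K couple the data points, so p(y | theta) does not
    # factorise into a product over individual observations.
    n = len(y)
    C = K + noise_var * np.eye(n)
    L = np.linalg.cholesky(C)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # C^{-1} y
    # log|C| = 2 * sum(log(diag(L))) via the Cholesky factor
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)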

Feel free to include it (thanks for contributing!) but perhaps put a note to that effect on it.

Neil

elkeschaper commented 8 years ago

Hi Neil,

You're making a very good point. Unfortunately, I do not know the best alternative for selecting among models with different numbers of hyperparameters (that is, when full evaluation of the models' posterior probabilities is not feasible).
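
One common pragmatic alternative is to compare the optimised log marginal likelihoods of candidate models directly. A minimal sketch, reusing the toy data from the sketch above and two assumed candidate kernels:

import numpy as np
import GPy

X = np.linspace(0, 10, 50)[:, None]
Y = np.sin(X) + 0.1 * np.random.randn(50, 1)

# Fit one GP per candidate kernel and compare log marginal likelihoods.
# Caveat: this ignores the differing numbers of hyperparameters, which is
# exactly the complexity concern raised in this thread.
for kern in [GPy.kern.RBF(input_dim=1), GPy.kern.Matern32(input_dim=1)]:
    m = GPy.models.GPRegression(X, Y, kernel=kern)
    m.optimize()
    print(kern.name, m.log_likelihood())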

I'd be very happy to add the BIC code snippet to the library.

Elke

lawrennd commented 8 years ago

@alansaul again, wouldn't Aki's work help here?

mzwiessele commented 8 years ago

Any news, can we close?

majidaldo commented 7 years ago

There's a mistake: in the positive (penalty) term, the log should be taken of the number of data points, not the number of parameters.
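
That is, with k optimised parameters and n data points, BIC = k * ln(n) - 2 * ln(L_hat). A corrected sketch of the function above:

import numpy as np

def calculate_BIC(model):
    # BIC = k * ln(n) - 2 * ln(L_hat): the log is taken of the number of
    # data points n, and the parameter count k multiplies it.
    n = len(model.X)
    k = model._size_transformed()
    return -2 * model.log_likelihood() + k * np.log(n)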

rohanchauhan commented 5 years ago

Hi, shouldn't the log-likelihood used here be evaluated at the maximum likelihood estimate of the hyperparameters?

ekalosak commented 1 year ago

Stale.