Hey,

I searched for an already existing implementation but could not find one, so I tried implementing my own. However, I am not sure the implementation is correct, because I am having trouble working out the number of parameters.
Below is my implementation. My problem has two-dimensional input, so a single parameter can itself have multiple dimensions, for example multiple length scales. As a result I think just len(self.parameters()) would be incorrect, but any feedback would be appreciated.
def bayesian_information_criterion(
    self, test_inputs: torch.Tensor, test_targets: torch.Tensor
) -> Union[int, float]:
    # count the number of scalar hyperparameters
    k = 0
    hyperparameters = list(self.parameters())
    # start from the second hyperparameter
    for param in hyperparameters[1:]:
        e = 1
        param_size = list(param.data.size())
        if param_size:
            for dim in param_size:
                e *= dim
        k += e
    # n is the sample size
    n = torch.tensor(test_inputs.size(dim=0))
    # calculate l, the log marginal likelihood with the optimized hyperparameters
    mll = gpytorch.mlls.ExactMarginalLogLikelihood(self.likelihood, self)
    self.eval()
    output = self(test_inputs)
    l = mll(output, test_targets).sum()
    # calculate the Bayesian information criterion
    # print(f"k={k}, n={n}, l={l}")
    return (k * torch.log(n).item() - 2 * l).item()
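For reference, the formula I am trying to compute is the standard BIC,

    BIC = k * ln(n) - 2 * ln(L),

where k is the number of free hyperparameters, n is the sample size, and L is the maximized marginal likelihood.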
I am also using the ExactGP class if that is relevant.
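To make the counting question concrete, here is a minimal sketch of the alternative I considered: counting every scalar entry of every hyperparameter tensor with numel() instead of multiplying the sizes by hand. The helper name count_hyperparameters is just for illustration, and whether the first parameter should be skipped, as in my code above, is exactly what I am unsure about.

    # Minimal sketch: count scalar hyperparameter entries with numel().
    # For example, an ARD lengthscale over two input dimensions is a tensor
    # with two entries, so it contributes 2 to the count rather than 1.
    def count_hyperparameters(model) -> int:
        return sum(param.numel() for param in model.parameters())

    # usage inside the method above would then be: k = count_hyperparameters(self)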