Thus far, we have been constructing kernels of the form
$$
\sigma^2 (K + \tau^2 I_n)
$$
and optimizing $\sigma^2$ with a closed-form equation. However, with the leave-one-out likelihood we can now optimize $\sigma^2$ directly. This will mean casting it as a `ScalarHyperparameter` and hooking it into the optimization chassis like the other hyperparameters. It will also allow us to bring our kernel model into the following more standard formulation
$$
\sigma^2 K + \tau^2I_n.
$$
I believe that it will not be worthwhile to maintain the old approach alongside the new one, since the leave-one-out likelihood is vastly superior to mean squared error as a loss function. However, we need to demonstrate that the new formulation is performant and sensitive to $\sigma^2$ before incorporating the changes into the code. Assuming that all of this is successful, we may want to deprecate `mse_fn` and `cross_entropy_fn` in favor of loss functions like `lool_fn` that directly regulate the variance with coverage or similar.
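To make the proposal concrete, here is a minimal sketch of what optimizing $\sigma^2$ against a leave-one-out log-likelihood might look like under the $\sigma^2 K + \tau^2 I_n$ formulation. This is not the library's API: the function names `loo_log_likelihood` and `optimize_sigma_sq` are hypothetical, and a real implementation would route $\sigma^2$ through the same hyperparameter machinery as the other parameters rather than a standalone scalar solve. The leave-one-out means and variances use the standard identities in terms of the precision matrix (e.g. GPML eq. 5.12).

```python
import numpy as np
from scipy.optimize import minimize_scalar


def loo_log_likelihood(sigma_sq, K, tau_sq, y):
    """Leave-one-out log-likelihood of y under Sigma = sigma^2 K + tau^2 I."""
    n = len(y)
    Sigma = sigma_sq * K + tau_sq * np.eye(n)
    Sigma_inv = np.linalg.inv(Sigma)
    Sigma_inv_y = Sigma_inv @ y
    diag = np.diag(Sigma_inv)
    # LOO residual for point i is [Sigma^{-1} y]_i / [Sigma^{-1}]_{ii};
    # LOO predictive variance is 1 / [Sigma^{-1}]_{ii}.
    loo_var = 1.0 / diag
    loo_resid = Sigma_inv_y / diag
    # Drop the constant -0.5 * n * log(2 * pi) term.
    return -0.5 * np.sum(np.log(loo_var) + loo_resid**2 / loo_var)


def optimize_sigma_sq(K, tau_sq, y, bounds=(1e-8, 1e4)):
    """Numerically maximize the LOO likelihood over the scalar sigma^2."""
    result = minimize_scalar(
        lambda s: -loo_log_likelihood(s, K, tau_sq, y),
        bounds=bounds,
        method="bounded",
    )
    return result.x
```

Because the loss is a scalar function of $\sigma^2$, a bounded one-dimensional search suffices here; in the full optimization chassis, $\sigma^2$ would instead contribute one coordinate to the joint hyperparameter objective alongside the kernel's other parameters.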