Closed: bwpriest closed this issue 1 year ago
This will need to occur before we incorporate the `AnisotropicDistortion` class. It should be much easier now that PR #118 is merged.
@alecmdunton take a look at this and let me know if you're feeling up to it. It should serve as a good warm-up for the `AnisotropicDistortion`, since it involves problems that you will also need to solve to get that hooked into the optimization chassis.
I will take care of this.
@bwpriest When we have the user create a kernel object, e.g., an RBF kernel, do we want the user to do something like

```python
RBF(metric=IsotropicDistortion("l2", length_scale=Hyperparameter(2.0, (0.0, 3.0))))
```

I need to know before I go ahead and start editing all of the files in `tests`.
I have anisotropy implemented and am now trying to decide how we support optimization - the choice seems to be between creating a new hyperparameter class and updating the existing `TensorHyperparameter` class.
> @bwpriest When we have the user create a kernel object, e.g., an RBF kernel, do we want the user to do something like `RBF(metric=IsotropicDistortion("l2", length_scale=Hyperparameter(2.0, (0.0, 3.0))))`. I need to know before I go ahead and start editing all of the files in `tests`.
Yes, we want `length_scale` to be a kwarg to `*Distortion.__init__()`. So, a full declaration of, say, an isotropic Matérn `MuyGPS` object with homoscedastic noise should look like
```python
muygps = MuyGPS(
    Matern(
        nu=Hyperparameter(0.38, (0.1, 1.5)),
        metric=IsotropicDistortion(
            "l2",
            length_scale=Hyperparameter(2.0, (0.1, 3.0)),
        ),
    ),
    eps=HomoscedasticNoise(1e-5),
)
```
I have the example code you shared above working for `IsotropicDistortion` and `AnisotropicDistortion` and am working on supporting optimization of anisotropic hyperparameters.
> I have anisotropy implemented and am now trying to decide how we support optimization - the choice seems to be between creating a new hyperparameter class and updating the existing `TensorHyperparameter` class.
I believe that the individual anisotropic scaling parameters don't have anything to do with one another as far as optimization is concerned, so my inclination is to treat them as distinct `Hyperparameter` objects. However, this would require the creation of an `AnisotropicDistortion.hyperparameters` dict, similar to the `KernelFn.hyperparameters` dict. It would also require canonical string names for each scaling parameter, so that they are stored/optimized as `"length_scale0"`, `"length_scale1"`, et cetera.
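A minimal sketch of that bookkeeping (hypothetical, with a stand-in `Hyperparameter` class; this is not the library's actual implementation, just an illustration of the canonical-naming idea):

```python
class Hyperparameter:
    """Stand-in for the library's Hyperparameter: a value and its bounds."""

    def __init__(self, value, bounds):
        self.value = value
        self.bounds = bounds


class AnisotropicDistortion:
    """Hypothetical sketch: register one Hyperparameter per feature dimension."""

    def __init__(self, metric, length_scale):
        self.metric = metric
        # One dict entry per dimension under canonical names
        # "length_scale0", "length_scale1", ..., mirroring the
        # KernelFn.hyperparameters dict.
        self.hyperparameters = {
            f"length_scale{i}": param
            for i, param in enumerate(length_scale)
        }


dist = AnisotropicDistortion(
    "l2",
    length_scale=(
        Hyperparameter(2.0, (0.1, 3.0)),
        Hyperparameter(1.5, (0.1, 3.0)),
    ),
)
print(sorted(dist.hyperparameters))  # → ['length_scale0', 'length_scale1']
```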
We should think about how we will create `AnisotropicDistortion` objects. The solution that I have suggested above would require us to do something like
```python
AnisotropicDistortion(
    "l2",
    length_scale=(
        Hyperparameter(2.0, (0.1, 3.0)),
        ...
        Hyperparameter(1.5, (0.1, 3.0)),
    ),
)
```
We do need to specify the `feature_count`, i.e. the number of length scale parameters, and be able to set their initial guesses and bounds independently. I think that it will be difficult to support all of that flexibility with a `TensorHyperparameter` or similar object.
Keeping everything separated in this way allows us to specify multiple length scales with the same initial guesses with

```python
AnisotropicDistortion(
    "l2",
    length_scale=[Hyperparameter(2.0, (0.1, 3.0))] * feature_count,
)
```
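One caveat worth noting (this assumes `Hyperparameter` objects carry mutable state that the optimizer updates in place, which the snippet above doesn't confirm): list multiplication repeats the *same* object `feature_count` times, so the entries would not be independent. A comprehension sidesteps the aliasing, as this stand-in sketch shows:

```python
class Hyperparameter:
    """Stand-in with a mutable value, as an optimizer might update it."""

    def __init__(self, value, bounds):
        self.value = value
        self.bounds = bounds


feature_count = 3

# List multiplication: all three entries alias one object.
shared = [Hyperparameter(2.0, (0.1, 3.0))] * feature_count
shared[0].value = 9.9
print(shared[1].value)  # 9.9: mutating one entry mutates them all

# Comprehension: each entry is a distinct object.
independent = [Hyperparameter(2.0, (0.1, 3.0)) for _ in range(feature_count)]
independent[0].value = 9.9
print(independent[1].value)  # 2.0: the other entries are untouched
```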
Let me know if you think that there is a compelling reason to create a different class that supports this functionality.
> I have the example code you shared above working for `IsotropicDistortion` and `AnisotropicDistortion` and am working on supporting optimization of anisotropic hyperparameters.
It sounds like you might have commingled things, but if not it would make sense to add the code that addresses this issue in a different PR than the PR that introduces `AnisotropicDistortion`. That way we can work out kinks in the API change that will inform how you deal with `AnisotropicDistortion`.
It is also generally good to keep refactoring and feature add updates separate. Several small, focused PRs are much easier to review (and write!) than one large, multipurpose PR.
I opened a feature branch prior to your earlier PR which moved object creation into the user's hands - I can go through and remove the anisotropic aspects of the code for now and save them for later.
If you are OK with the `dict` approach for anisotropy I am with you - it was the cumbersome nature of having to feed a separate object for each dimension that I was hesitant about, but we really don't want users to optimize that many anisotropic hyperparameters in the first place anyway.
> I opened a feature branch prior to your earlier PR which moved object creation into the user's hands - I can go through and remove the anisotropic aspects of the code for now and save them for later.
Sounds good!
> If you are OK with the `dict` approach for anisotropy I am with you - it was the cumbersome nature of having to feed a separate object for each dimension that I was hesitant about, but we really don't want users to optimize that many anisotropic hyperparameters in the first place anyway.
I think that specifying anisotropic length scale parameters will be a little cumbersome no matter what we do, because we will always need to support the user's ability to specify separate guesses and bounds for each dimension. We could make certain things like sharing initial guesses and bounds slightly easier with a new object, but I am not convinced that it is worth the maintenance overhead. We can always revisit that later if it becomes an issue.
This issue was dealt with in PR #122
This will make it possible to hold multiple dimension-wise `length_scale` parameters inside of `AnisotropicDistortion` without conflicting with the `length_scale` parameter that is currently inside of the kernel functions. It will be necessary to add `get_optim_params` and `get_opt_fn` methods to the distortion models and to hook them into those of `Matern` and `RBF`. It will probably make the most sense to add distortion-parameter extraction to the base `KernelFn` and to have the higher-level functions extend it (e.g. `Matern` will need to add `nu`).
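A hedged sketch of how that hookup might look. The method name `get_optim_params` and the class names come from the comment above, but every class body here is a hypothetical stand-in, not MuyGPyS source:

```python
class Hyperparameter:
    """Stand-in: a value and its optimization bounds."""

    def __init__(self, value, bounds):
        self.value = value
        self.bounds = bounds


class IsotropicDistortion:
    """Hypothetical distortion model owning its own length_scale."""

    def __init__(self, metric, length_scale):
        self.metric = metric
        self.hyperparameters = {"length_scale": length_scale}

    def get_optim_params(self):
        # The distortion contributes its parameters to the optimization.
        return dict(self.hyperparameters)


class KernelFn:
    """Hypothetical base kernel: merges distortion and kernel parameters."""

    def __init__(self, metric):
        self._metric = metric
        self.hyperparameters = {}

    def get_optim_params(self):
        # Base class pulls in the distortion's parameters; subclasses
        # extend self.hyperparameters with their own (e.g. Matern adds nu).
        params = self._metric.get_optim_params()
        params.update(self.hyperparameters)
        return params


class Matern(KernelFn):
    def __init__(self, nu, metric):
        super().__init__(metric)
        self.hyperparameters["nu"] = nu


kern = Matern(
    nu=Hyperparameter(0.38, (0.1, 1.5)),
    metric=IsotropicDistortion("l2", Hyperparameter(2.0, (0.1, 3.0))),
)
print(sorted(kern.get_optim_params()))  # → ['length_scale', 'nu']
```

Swapping in an anisotropic distortion whose dict holds `"length_scale0"`, `"length_scale1"`, etc. would then flow through the same `get_optim_params` path without any change to the kernel classes.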