LLNL / MuyGPyS

A fast, pure Python implementation of the MuyGPs Gaussian process realization and training algorithm.

Need to move `length_scale` parameter inside of the distortion model and out of the kernel functions #120

Closed: bwpriest closed this issue 1 year ago

bwpriest commented 1 year ago

This will make it possible to hold multiple dimension-wise `length_scale` parameters inside of `AnisotropicDistortion` without conflicting with the `length_scale` parameter that currently lives inside the kernel functions. It will be necessary to add `get_optim_params` and `get_opt_fn` methods to the distortion models and to hook them into those of `Matern` and `RBF`. It probably makes the most sense to add distortion-parameter extraction to the base `KernelFn` and to have the subclasses extend those methods (e.g., `Matern` will need to add `nu`).
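
A minimal sketch of the delegation pattern described above. The class bodies here are illustrative stand-ins, not the library's actual implementation; only the `get_optim_params` method name and the extension pattern come from the proposal:

class Hyperparameter:
    # Stand-in: a scalar value with optimization bounds.
    def __init__(self, value, bounds):
        self.value = value
        self.bounds = bounds

class IsotropicDistortion:
    # Stand-in distortion model that now owns length_scale.
    def __init__(self, metric, length_scale):
        self.metric = metric
        self.hyperparameters = {"length_scale": length_scale}

    def get_optim_params(self):
        # Expose the distortion model's optimizable parameters.
        return dict(self.hyperparameters)

class KernelFn:
    # Base kernel: extracts parameters from its distortion model.
    def __init__(self, metric):
        self.distortion = metric

    def get_optim_params(self):
        return self.distortion.get_optim_params()

class Matern(KernelFn):
    # Subclass extends the base extraction with its own parameters.
    def __init__(self, nu, metric):
        super().__init__(metric)
        self.nu = nu

    def get_optim_params(self):
        params = super().get_optim_params()
        params["nu"] = self.nu
        return params

kern = Matern(
    nu=Hyperparameter(0.38, (0.1, 1.5)),
    metric=IsotropicDistortion("l2", Hyperparameter(2.0, (0.1, 3.0))),
)
print(sorted(kern.get_optim_params()))  # ['length_scale', 'nu']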

bwpriest commented 1 year ago

This will need to occur before we incorporate the AnisotropicDistortion class. It should be much easier now that PR #118 is merged.

bwpriest commented 1 year ago

@alecmdunton take a look at this and let me know if you're feeling up to it. It should serve as a good warm-up for the AnisotropicDistortion, since it involves problems that you will also need to solve to get that hooked into the optimization chassis.

alecmdunton commented 1 year ago

I will take care of this

alecmdunton commented 1 year ago

@bwpriest When we have the user create a kernel object, e.g., an RBF kernel, do we want the user to do something like

RBF(metric=IsotropicDistortion("l2", length_scale=Hyperparameter(2.0, (0.0, 3.0))))

I need to know before I go ahead and start editing all of the files in tests.

alecmdunton commented 1 year ago

I have anisotropy implemented and am now trying to decide how we support optimization - the choice seems to be between creating a new hyperparameter class and updating the existing TensorHyperparameter class.

bwpriest commented 1 year ago

> @bwpriest When we have the user create a kernel object, e.g., an RBF kernel, do we want the user to do something like
>
> RBF(metric=IsotropicDistortion("l2", length_scale=Hyperparameter(2.0, (0.0, 3.0))))
>
> I need to know before I go ahead and start editing all of the files in tests.

Yes, we want `length_scale` to be a kwarg to `*Distortion.__init__()`. So, a full declaration of, say, an isotropic Matérn MuyGPS object with homoscedastic noise should look like

muygps = MuyGPS(
    Matern(
        nu=Hyperparameter(0.38, (0.1, 1.5)),
        metric=IsotropicDistortion(
            "l2",
            length_scale=Hyperparameter(2.0, (0.1, 3.0)),
        ),
    ),
    eps=HomoscedasticNoise(1e-5)
)
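
For comparison, the RBF declaration you asked about would follow the same pattern. This is a sketch extrapolating from the Matérn example above, reusing the initial guess and bounds from your question:

muygps_rbf = MuyGPS(
    RBF(
        metric=IsotropicDistortion(
            "l2",
            length_scale=Hyperparameter(2.0, (0.0, 3.0)),
        ),
    ),
    eps=HomoscedasticNoise(1e-5),
)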

alecmdunton commented 1 year ago

I have the example code you shared above working for IsotropicDistortion and AnisotropicDistortion and am working on supporting optimization of anisotropic hyperparameters.

bwpriest commented 1 year ago

> I have anisotropy implemented and am now trying to decide how we support optimization - the choice seems to be between creating a new hyperparameter class and updating the existing TensorHyperparameter class.

I believe that the individual anisotropic scaling parameters don't have anything to do with one another as far as optimization is concerned, so my inclination is to treat them as distinct `Hyperparameter` objects. However, this would require the creation of an `AnisotropicDistortion.hyperparameters` dict, similar to the `KernelFn.hyperparameters` dict. It would also require canonical string names for each scaling parameter, so that they are stored/optimized as `"length_scale0"`, `"length_scale1"`, et cetera.
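
A sketch of how those canonical names might be generated; the class body is illustrative, not a proposed final implementation:

class AnisotropicDistortion:
    def __init__(self, metric, length_scale):
        # length_scale: a sequence of per-dimension parameters (in
        # practice, distinct Hyperparameter objects as argued above).
        self.metric = metric
        # Canonical indexed names let the optimization chassis address
        # each dimension's scaling parameter independently.
        self.hyperparameters = {
            "length_scale{}".format(i): param
            for i, param in enumerate(length_scale)
        }

    def get_optim_params(self):
        return dict(self.hyperparameters)

dist = AnisotropicDistortion("l2", length_scale=(2.0, 1.5))
print(sorted(dist.hyperparameters))  # ['length_scale0', 'length_scale1']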

We should think about how we will create AnisotropicDistortion objects. The solution that I have suggested above would require us to do something like

AnisotropicDistortion(
    "l2",
    length_scale=(
        Hyperparameter(2.0, (0.1, 3.0)),
        ...,
        Hyperparameter(1.5, (0.1, 3.0)),
    ),
)

We do need to specify the feature_count, i.e., the number of length scale parameters, and be able to set their initial guesses and bounds independently. I think that it will be difficult to support all of that flexibility with a TensorHyperparameter or similar object. Keeping everything separated in this way still allows us to specify multiple length scales with the same initial guesses, for example with a list comprehension, so that each dimension gets its own distinct `Hyperparameter` instance:

AnisotropicDistortion(
    "l2",
    length_scale=[Hyperparameter(2.0, (0.1, 3.0)) for _ in range(feature_count)],
)

Let me know if you think that there is a compelling reason to create a different class that supports this functionality.

bwpriest commented 1 year ago

> I have the example code you shared above working for IsotropicDistortion and AnisotropicDistortion and am working on supporting optimization of anisotropic hyperparameters.

It sounds like you might have commingled things, but if not, it would make sense to address this issue in a separate PR from the one that introduces `AnisotropicDistortion`. That way we can work out kinks in the API change, which will inform how you deal with `AnisotropicDistortion`.

It is also generally good to keep refactors and feature additions separate. Several small, focused PRs are much easier to review (and write!) than one large, multipurpose PR.

alecmdunton commented 1 year ago

I opened a feature branch prior to your earlier PR, which moved object creation into the user's hands - I can go through and remove the anisotropic aspects of the code for now and save them for later.

If you are OK with the dict approach for anisotropy, I am with you - it was the cumbersome nature of having to feed a separate object for each dimension that I was hesitant about, but we really don't want users to optimize that many anisotropic hyperparameters in the first place anyway.

bwpriest commented 1 year ago

> I opened a feature branch prior to your earlier PR, which moved object creation into the user's hands - I can go through and remove the anisotropic aspects of the code for now and save them for later.

Sounds good!

> If you are OK with the dict approach for anisotropy, I am with you - it was the cumbersome nature of having to feed a separate object for each dimension that I was hesitant about, but we really don't want users to optimize that many anisotropic hyperparameters in the first place anyway.

I think that specifying anisotropic length scale parameters will be a little cumbersome no matter what we do, because we will always need to support the user's ability to specify separate guesses and bounds for each dimension. We could make certain things like sharing initial guesses and bounds slightly easier with a new object, but I am not convinced that it is worth the maintenance overhead. We can always revisit that later if it becomes an issue.

alecmdunton commented 1 year ago

This issue was resolved in PR #122.