Closed: appletree999 closed this issue 3 years ago
I am not sure what random_state does in sklearn's GPR initializer, but to get consistent behaviour across experiments in GPy you can simply set numpy's random seed before running your experiments, e.g.:
import numpy as np
import GPy

np.random.seed(42)  # fix the seed before creating data and the model
x = np.random.uniform(0., 1., (50, 1))
y = np.sin(6 * x) + 0.05 * np.random.randn(50, 1)
m = GPy.models.SparseGPRegression(x, y, num_inducing=10)
m.optimize(max_iters=2, messages=False)
print(m)
will always give you the same result for the same seed, whereas if you omit the np.random.seed call you'll get different results every time you run that snippet.
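The effect of seeding can be checked with plain numpy, independently of GPy: reseeding with the same value makes every subsequent draw identical, which is why the model above trains identically across runs.

```python
import numpy as np

# Same seed -> identical draws, so any downstream model sees identical data
np.random.seed(42)
a = np.random.randn(5)

np.random.seed(42)
b = np.random.randn(5)

print(np.allclose(a, b))  # True: the two sequences match exactly
```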
You can indeed use the instantiated/trained kernel of a model. E.g. continuing the above example you can further do this:
m_new = GPy.models.SparseGPRegression(x, y, num_inducing=10, kernel=m.kern.copy())
to set a new model with the trained kernel from an old model.
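The reason for the .copy() can be sketched with plain Python (an illustration of the aliasing pitfall, not GPy's internals; the dict and its keys are hypothetical stand-ins for a kernel's mutable parameters): passing m.kern directly would tie both models to the same parameter objects, while a copy keeps them independent.

```python
import copy

# Stand-in for a kernel's mutable parameter container (hypothetical names)
kern = {"lengthscale": 1.0, "variance": 2.0}

shared = kern                      # passing the object directly: both names alias it
independent = copy.deepcopy(kern)  # what m.kern.copy() achieves: a detached copy

kern["lengthscale"] = 0.5          # e.g. further training of the old model

print(shared["lengthscale"])       # 0.5: the alias sees the change
print(independent["lengthscale"])  # 1.0: the copy is unaffected
```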
Regarding optimizer, from the docstring you can see that:
:param optimizer: which optimizer to use (defaults to self.preferred optimizer), a range of optimisers can be found in :module:`~GPy.inference.optimization`, they include 'scg', 'lbfgs', 'tnc'.
and for an instantiated model you can see the preferred optimizer:
In []: m.preferred_optimizer
Out[]: 'lbfgsb'
Hi adamian, thanks for your answer. That clarified the "kernel" question and "m.preferred_optimizer". I took a look at "~GPy.inference.optimization" (here https://gpy.readthedocs.io/en/deploy/GPy.inference.optimization.html), but it doesn't look like any optimizers are listed there?
A little more detail about the "retraining" question. For example, you first train the GPR for some rounds, then analyze the data and decide to keep training. I'd like the process to pick up where it left off in the previous training and continue from there (not repeat the previous training sequence). The "random_state" in sklearn's GPR initializer is for that purpose.
Regarding optimization: The optimization of a model is actually inherited from the paramz package, you can see it here: https://github.com/sods/paramz/blob/master/paramz/model.py
Regarding continuing training: in GPy the state is internal to the instantiated model object. If you call m.optimize() and then m.optimize() again, the second optimization will continue from where the first one left off. Note that in practice calling m.optimize(max_iters=100); m.optimize(max_iters=100) might not always be exactly equivalent to m.optimize(max_iters=200), depending on the optimizer you choose; e.g. an optimizer using line search might have to re-initialize some of its hyper-parameters the second time it is called.
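The warm-start behaviour can be illustrated outside GPy with a toy optimizer (a sketch, not GPy's actual code): a stateless gradient-descent step makes two 100-iteration runs exactly equal to one 200-iteration run, whereas optimizers that keep internal state between iterations (line-search step sizes, L-BFGS curvature pairs) lose that state between calls, which is the caveat above.

```python
# Toy illustration: plain gradient descent on f(x) = (x - 3)^2.
# Each step depends only on the current iterate, so resuming from the
# last iterate is exactly equivalent to one longer run.

def optimize(x, max_iters, lr=0.1):
    """Minimise (x - 3)^2 by gradient descent, returning the final iterate."""
    for _ in range(max_iters):
        x = x - lr * 2 * (x - 3)  # gradient of (x - 3)^2 is 2 * (x - 3)
    return x

x_split = optimize(optimize(0.0, 100), 100)  # two runs of 100 iterations
x_once = optimize(0.0, 200)                  # one run of 200 iterations
print(x_split == x_once)  # True for this stateless optimizer
```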
Thank you adamian for your answers (the GitHub link is also helpful).
Glad to help. I'll close this issue for now, since it seems resolved.
This is probably not an issue, but I have some questions and didn't know where else to ask.
I just started trying GPy. Previously I used sklearn, where you can retrain the GPRegressor by passing in the previous random state. Is there something similar in GPy.models.GPRegression? I didn't see such a parameter. https://gpy.readthedocs.io/en/deploy/GPy.models.html?highlight=gpregression#GPy.models.gp_regression.GPRegression
Also, after training, if you want to use the kernel variable in a GPRegressor "gpr", can you just call it like "gpr.kernel" to get the trained kernel?
Also, what are the available optimizers? They aren't listed in the documentation. What is the value of "self.preferred_optimizer"? https://gpy.readthedocs.io/en/deploy/GPy.core.html?highlight=Optimize#GPy.core.gp.GP.optimize
Thanks