nrontsis / PILCO

Bayesian Reinforcement Learning in Tensorflow
MIT License

Bugs in model update? #63

Open fuku10 opened 1 year ago

fuku10 commented 1 year ago

Hello, I found some strange behavior in the model optimization in mgpr.py.

(1) Should best_params["k_lengthscales"] = model.kernel.lengthscales be best_params["lengthscales"] = model.kernel.lengthscales instead?

(2) It seems that best_params is updated whenever optimizer.minimize(model.training_loss, model.trainable_variables) is executed. This means that the values in best_params always change, regardless of whether loss < best_loss is True or False.
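
The behavior in (2) is what you would expect if best_params holds references to the parameter objects rather than snapshots of their values. Here is a minimal, library-free sketch of that effect. The Param class is a hypothetical stand-in for a mutable model parameter, not the GPflow class, and the numbers are made up for illustration.

```python
class Param:
    """Hypothetical stand-in for a mutable model parameter (not the GPflow class)."""

    def __init__(self, value):
        self.value = value

    def assign(self, value):
        # In-place update, similar in spirit to an optimizer step.
        self.value = value


lengthscales = Param(1.0)

# Storing the object itself stores a reference, not a snapshot of its value.
best_params = {"lengthscales": lengthscales}

# A later optimization step changes the parameter in place...
lengthscales.assign(5.0)

# ...and the "saved" entry changes with it, whether or not the loss improved.
aliased = best_params["lengthscales"] is lengthscales
saved_value = best_params["lengthscales"].value
print(aliased, saved_value)
```

Under these assumptions the dictionary entry and the live parameter are the same object, so any comparison such as loss < best_loss has no effect on what ends up stored.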

My environment is: Python 3.7.12, tensorflow 2.9.1, gpflow 2.5.2, gym 0.18.0.

Thanks,

fuku10 commented 1 year ago

It seems that (2) was solved by using copy.deepcopy().

I added import copy and changed

            best_params = {
                "lengthscales" : model.kernel.lengthscales,
                "k_variance" : model.kernel.variance,
                "l_variance" : model.likelihood.variance}

to

            best_params = {
                "lengthscales" : copy.deepcopy(model.kernel.lengthscales),
                "k_variance" : copy.deepcopy(model.kernel.variance),
                "l_variance" : copy.deepcopy(model.likelihood.variance)}

Moreover, I changed

                    best_params["k_lengthscales"] = model.kernel.lengthscales
                    best_params["k_variance"] = model.kernel.variance
                    best_params["l_variance"] = model.likelihood.variance

to

                    best_params["lengthscales"] = copy.deepcopy(model.kernel.lengthscales) 
                    best_params["k_variance"] = copy.deepcopy(model.kernel.variance)
                    best_params["l_variance"] = copy.deepcopy(model.likelihood.variance)

(Note that "k_lengthscales" is also changed to "lengthscales".)
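
To show how the two changes fit together, here is a rough sketch of a keep-the-best loop that snapshots parameters with copy.deepcopy() and uses one consistent dictionary key. It is not the code from mgpr.py: Param, fit_with_restarts, the candidate values, and the quadratic loss are all hypothetical, and param.assign() only stands in for the optimizer step.

```python
import copy


class Param:
    """Hypothetical stand-in for a mutable model parameter (not the GPflow class)."""

    def __init__(self, value):
        self.value = value

    def assign(self, value):
        self.value = value


def fit_with_restarts(param, candidates, loss_fn):
    """Try each candidate value and keep a snapshot of the best one seen."""
    best_loss = float("inf")
    # Snapshot the initial state; the key matches the one updated in the loop.
    best_params = {"lengthscales": copy.deepcopy(param)}
    for value in candidates:
        param.assign(value)  # stands in for optimizer.minimize(...)
        loss = loss_fn(param.value)
        if loss < best_loss:
            best_loss = loss
            # Deep copy, so later in-place updates cannot alter the saved value.
            best_params["lengthscales"] = copy.deepcopy(param)
    return best_loss, best_params


param = Param(0.0)
best_loss, best_params = fit_with_restarts(
    param, [4.0, 1.0, 3.0], lambda x: (x - 1.0) ** 2
)
print(best_loss, best_params["lengthscales"].value, param.value)
```

In this sketch the live parameter ends at the last candidate tried, while best_params keeps the value with the lowest loss. Writing to a different key inside the loop (for example "k_lengthscales") would instead leave the "lengthscales" entry at its initial snapshot.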