I found the bug: the duplicated keys come from `find_trainable_variables`, which is not correctly implemented. `find_trainable_variables('bla')` returns the same as `find_trainable_variables()`, even though the scope `bla` is not defined.
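For reference, the broken helper looked roughly like this (a sketch of the old baselines implementation; the key point is that `tf.trainable_variables()` ignores the surrounding `variable_scope` and returns every trainable variable in the graph):

```python
import tensorflow as tf

def find_trainable_variables(key):
    # BUG: entering a variable_scope has no filtering effect on
    # tf.trainable_variables(), so this returns ALL trainable
    # variables, whatever `key` is
    with tf.variable_scope(key):
        return tf.trainable_variables()
```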
A fix for that function is easy:
```python
def find_trainable_variables(key):
    """
    Returns the trainable variables within a given scope

    :param key: (str) The variable scope
    :return: ([TensorFlow Tensor]) the trainable variables
    """
    return tf.trainable_variables(scope=key)
```
or simply use `tf_util.get_trainable_vars(scope)`, which is already in the code... However, it may break previously saved models.
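A quick sanity check of the fixed behaviour (a sketch, assuming a fresh TF1 graph; the scope name `model` and variable `w` are only for illustration):

```python
import tensorflow as tf

with tf.variable_scope("model"):
    tf.get_variable("w", shape=(2, 2))

assert len(find_trainable_variables("model")) == 1
assert find_trainable_variables("bla") == []  # undefined scope -> no variables
```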
The good news is that it only affects DDPG with normalization. For the other algorithms, it was just saving duplicated params.
Note: this was silently fixed in the OpenAI repo in https://github.com/openai/baselines/commit/8c2aea2addc9f3ba36d4a0c937e6a2d09830afc7
**Describe the bug**

DDPG models saved with stable-baselines <= v2.5.1 (and normalization activated) cannot be loaded with the master version. This bug must be fixed before the next version.
**Code example**

```python
from stable_baselines import DDPG

model = DDPG('MlpPolicy', 'Pendulum-v0', normalize_observations=True)
model.save("DDPG_pendulum_v2.5.1")
```
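The failing step is presumably loading that file with the master version (the call below is inferred from the bug description, not part of the original snippet):

```python
# with the master version, loading the file saved by v2.5.1 raises
model = DDPG.load("DDPG_pendulum_v2.5.1")
```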
**Traceback:**
Looking at what is happening: there seem to be duplicated keys in `self.params` of DDPG, and those duplicates are removed because we are using a dict now. I'm currently investigating how to fix this bug. Any help is appreciated.
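A minimal illustration of the failure mode (the variable names below are made up; the point is that converting a list with duplicated keys into a dict silently drops entries, so the saved and expected parameter counts no longer match on load):

```python
# params used to be a list of (name, value) pairs, with duplicates
params = [("model/pi/w:0", 1), ("model/pi/w:0", 1), ("obs_rms/mean:0", 2)]

unique = dict(params)  # duplicated keys collapse to a single entry
assert len(params) == 3 and len(unique) == 2
```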