Hi, I'm using rl coach through AWS SageMaker, and I'm running into an issue that I struggle to understand.
I'm performing RL with AWS SageMaker for the learning and AWS RoboMaker for the environment, as in DeepRacer, which uses rl coach as well. In fact, the code differs only slightly from the DeepRacer code on the learning side, though the environment is completely different.
What happens:
The graph manager initialization succeeds
A first checkpoint is generated (and uploaded to S3)
The agent loads the first checkpoint
The agent performs N episodes with the first policy
The graph manager fetches the N episodes
The graph manager performs 1 training step and creates a second checkpoint (uploaded to S3)
The agent fails to restore the model from the second checkpoint.
The agent raises an exception with the message:
Failed to restore agent's checkpoint: 'main_level/agent/main/online/global_step'
The traceback points to this rl coach module:
File "/someverylongpath/rl_coach/architectures/tensorflow_components/savers.py", line 93, in <dictcomp>
    for ph, v in zip(self._variable_placeholders, self._variables)
KeyError: 'main_level/agent/main/online/global_step'
Which I think suggests that in the function from_arrays, variables and self._variables do not contain the same variables...
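To make the failure mode concrete, here is a minimal self-contained sketch of the pattern I believe that dict comprehension implements (names and structure are my simplification, not the actual savers.py code): the saver zips its placeholders with its tracked variables and looks each variable up in the dict of arrays loaded from the checkpoint, so any variable the saver tracks that is absent from the checkpoint raises exactly this KeyError.

```python
# Simplified, hypothetical sketch of the failing pattern in savers.py.
# arrays: {variable_name: value} restored from the checkpoint file.
def build_feed_dict(variable_placeholders, variables, arrays):
    return {
        ph: arrays[v]  # KeyError here if v is missing from the checkpoint
        for ph, v in zip(variable_placeholders, variables)
    }

saver_vars = [
    "main_level/agent/main/online/w",
    "main_level/agent/main/online/global_step",
]

# First restore: every variable the saver tracks is in the checkpoint -> OK.
build_feed_dict(
    ["ph_w", "ph_gs"],
    saver_vars,
    {"main_level/agent/main/online/w": 1.0,
     "main_level/agent/main/online/global_step": 0},
)

# Second restore: if the checkpoint and the saver's variable list have
# diverged, the lookup fails with the observed KeyError.
try:
    build_feed_dict(
        ["ph_w", "ph_gs"],
        saver_vars,
        {"main_level/agent/main/online/w": 2.0},  # global_step missing
    )
except KeyError as e:
    print("Failed to restore agent's checkpoint:", e)
```

So the error message itself only tells us the two sides disagree on one name; it doesn't say which side changed between the two checkpoints.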
So, there are a few things I don't understand about this problem, and since I'm not used to rl coach, I think your point of view would be valuable.
How can variables and self._variables be different?
Why does it fail only the second time? (Does the graph manager change the computational graph?)
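To try to answer the second question myself, I'd like to diff the variable names actually stored in the first and second checkpoints against the names the saver expects. A small helper for that comparison (names in the example are hypothetical; in TF 1.x the names stored in a checkpoint can be read with tf.train.list_variables(checkpoint_path), which returns (name, shape) pairs):

```python
def diff_variable_names(expected, found):
    """Compare the variable names a saver expects against the names
    actually present in a checkpoint; report what's missing or extra."""
    expected, found = set(expected), set(found)
    return {
        "missing_from_checkpoint": sorted(expected - found),
        "unexpected_in_checkpoint": sorted(found - expected),
    }

# Hypothetical example: the second checkpoint lacks global_step.
saver_vars = [
    "main_level/agent/main/online/w",
    "main_level/agent/main/online/global_step",
]
ckpt2_vars = ["main_level/agent/main/online/w"]

print(diff_variable_names(saver_vars, ckpt2_vars))
# In TF 1.x, ckpt2_vars could come from:
#   [name for name, _ in tf.train.list_variables(checkpoint_path)]
```

If global_step shows up as missing only in the second checkpoint, that would point at the graph (or the saver's variable list) changing after the first training step rather than at the upload/download path.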
A few more info: I'm using rl-coach-slim 1.0.0 and tensorflow 1.11.0.