IntelLabs / coach

Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state of the art Reinforcement Learning algorithms
https://intellabs.github.io/coach/
Apache License 2.0

tensorflow.python.framework.errors_impl.InvalidArgumentError: Task 8 was not defined in job "worker" #391

Closed zhichao-li closed 4 years ago

zhichao-li commented 4 years ago

I hit the following error when running "coach -r -p Atari_A3C -lvl breakout -n 8" with Coach 1.0.0 and Intel TensorFlow 1.13.1 or 1.9.0 on Ubuntu 18.04. Any suggestions?

Traceback (most recent call last):
  File "/home/lizhichao/anaconda3/envs/coachpy36/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/lizhichao/anaconda3/envs/coachpy36/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/lizhichao/anaconda3/envs/coachpy36/lib/python3.6/site-packages/rl_coach/coach.py", line 79, in start_graph
    graph_manager.create_graph(task_parameters)
  File "/home/lizhichao/anaconda3/envs/coachpy36/lib/python3.6/site-packages/rl_coach/graph_managers/graph_manager.py", line 145, in create_graph
    self.create_worker_or_parameters_server(task_parameters=task_parameters)
  File "/home/lizhichao/anaconda3/envs/coachpy36/lib/python3.6/site-packages/rl_coach/graph_managers/graph_manager.py", line 207, in create_worker_or_parameters_server
    return GraphManager._create_worker_or_parameters_server_tf(task_parameters)
  File "/home/lizhichao/anaconda3/envs/coachpy36/lib/python3.6/site-packages/rl_coach/graph_managers/graph_manager.py", line 199, in _create_worker_or_parameters_server_tf
    config=config)
  File "/home/lizhichao/anaconda3/envs/coachpy36/lib/python3.6/site-packages/rl_coach/architectures/tensorflow_components/distributed_tf_utils.py", line 64, in create_worker_server_and_device
    server = tf.train.Server(cluster_spec, job_name="worker", task_index=task_index, config=config)
  File "/home/lizhichao/anaconda3/envs/coachpy36/lib/python3.6/site-packages/tensorflow/python/training/server_lib.py", line 147, in __init__
    self._server_def.SerializeToString(), status)
  File "/home/lizhichao/anaconda3/envs/coachpy36/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 519, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Task 8 was not defined in job "worker"
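
For context on what the message means: tf.train.Server raises this error when the task_index it receives has no matching entry in the cluster spec for that job. Task indices are zero-based, so a "worker" job with 8 addresses only defines tasks 0 through 7, and index 8 is out of range. The sketch below reproduces that check in plain Python without TensorFlow. The helper names build_cluster and check_task, the host, and the port numbers are hypothetical and are not taken from Coach; the dict only mirrors the job-name-to-address-list shape that tf.train.ClusterSpec accepts. It does not show how Coach actually assigns task indices, which would need to be checked in distributed_tf_utils.py.

```python
def build_cluster(num_workers, host="localhost", base_port=2222):
    """Build a dict shaped like tf.train.ClusterSpec input.

    Hypothetical layout: one parameter server plus num_workers workers,
    each on its own port. Coach's real layout may differ.
    """
    return {
        "ps": ["{}:{}".format(host, base_port)],
        "worker": [
            "{}:{}".format(host, base_port + 1 + i) for i in range(num_workers)
        ],
    }


def check_task(cluster, job_name, task_index):
    """Return the address for (job_name, task_index) or raise ValueError.

    Mimics the validation that produces the error in the traceback:
    the index must satisfy 0 <= task_index < number of tasks in the job.
    """
    tasks = cluster.get(job_name, [])
    if not 0 <= task_index < len(tasks):
        raise ValueError(
            'Task {} was not defined in job "{}" ({} tasks defined)'.format(
                task_index, job_name, len(tasks)
            )
        )
    return tasks[task_index]
```

With build_cluster(8), calling check_task(cluster, "worker", 7) returns an address, while check_task(cluster, "worker", 8) raises. If the same pattern applies here, one thing worth checking is whether the worker index passed to tf.train.Server is offset by one, for example because the parameter server process is counted among the worker tasks.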