Closed: danijar closed this issue 5 years ago.
@danijar
INFO:tensorflow: --------------------------------------------------
Epoch 1 phase train (phase step 0, global step 0).
2019-03-18 23:02:01.348377: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 783.09MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
step/score/loss/zs_entropy/zs_divergence = [0, -nan, 11821.6729, 35.320507, 2.84842563]
2019-03-18 23:02:02.203470: W tensorflow/core/common_runtime/bfc_allocator.cc:211] Allocator (GPU_0_bfc) ran out of memory trying to allocate 783.09MiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
step/score/loss/zs_entropy/zs_divergence = [15, -nan, 11833.0537, 35.260334, 3.1794312]
2019-03-18 23:02:05.863356: W tensorflow/core/framework/op_kernel.cc:1261] Unknown: exceptions.RuntimeError: Cannot make context current on thread <_DummyThread(Dummy-5, started daemon 140200046999296)>: this context is already current on another thread <_DummyThread(Dummy-4, started daemon 140200038606592)>.
Traceback (most recent call last):
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/ops/script_ops.py", line 206, in __call__
    ret = func(*args)
  File "planet/control/in_graph_batch_env.py", line 95, in <lambda>
    lambda a: self._batch_env.step(a)[:3], [action],
  File "planet/control/batch_env.py", line 86, in step
    for env, action in zip(self._envs, actions)]
  File "planet/control/wrappers.py", line 90, in step
    obs, reward, done, info = self._env.step(action)
  File "planet/control/wrappers.py", line 367, in step
    transition = self._env.step(action, *args, **kwargs)
  File "planet/control/wrappers.py", line 445, in step
    observ, reward, done, info = self._env.step(action)
  File "planet/control/wrappers.py", line 156, in step
    obs[self._key] = self._render_image()
  File "planet/control/wrappers.py", line 165, in _render_image
    image = self._env.render('rgb_array')
  File "planet/control/wrappers.py", line 261, in render
    *self._render_size, camera_id=self._camera_id)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/dm_control/mujoco/engine.py", line 171, in render
    physics=self, height=height, width=width, camera_id=camera_id)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/dm_control/mujoco/engine.py", line 574, in __init__
    with self._physics.contexts.gl.make_current() as ctx:
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/dm_control/_render/base.py", line 116, in make_current
    _CURRENT_THREAD_FOR_CONTEXT[id(self)]))
RuntimeError: Cannot make context current on thread <_DummyThread(Dummy-5, started daemon 140200046999296)>: this context is already current on another thread <_DummyThread(Dummy-4, started daemon 140200038606592)>.
WARNING:tensorflow:Worker 006d3f0c-93c1-4a10-aee4-32ab0c8b125d run 00001: Exception:
Traceback (most recent call last):
  File "planet/training/running.py", line 199, in __iter__
    for value in self._process_fn(self._logdir, *args):
  File "/home/lukas/workspace/planet_src/planet/scripts/train.py", line 91, in process
    training.define_model, dataset, logdir, config):
  File "planet/training/utility.py", line 179, in train
    for score in trainer.iterate(config.max_steps):
  File "planet/training/trainer.py", line 201, in iterate
    summary, mean_score, global_step = sess.run(phase.op, phase.feed)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
UnknownError: exceptions.RuntimeError: Cannot make context current on thread <_DummyThread(Dummy-5, started daemon 140200046999296)>: this context is already current on another thread <_DummyThread(Dummy-4, started daemon 140200038606592)>.
  [[node graph/collection/should_collect_cheetah_run/simulate-1/train-cheetah_run-cem-12/scan/while/simulate/environment/simulate/step (defined at planet/control/in_graph_batch_env.py:96) = PyFunc[Tin=[DT_FLOAT], Tout=[DT_UINT8, DT_FLOAT, DT_BOOL], token="pyfunc_7", _device="/job:localhost/replica:0/task:0/device:CPU:0"](graph/collection/should_collect_cheetah_run/simulate-1/train-cheetah_run-cem-12/scan/while/simulate/Identity_5/_847)]]
  [[{{node GroupCrossDeviceControlEdges_0/graph/collection/should_collect_cheetah_run/simulate-1/train-cheetah_run-cem-12/scan/while/simulate/environment/simulate/group_deps/_874}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_10532...group_deps", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](^_cloopgraph/collection/should_collect_cheetah_run/simulate-1/train-cheetah_run-cem-12/scan/while/Equal/_27)]]
Caused by op u'graph/collection/should_collect_cheetah_run/simulate-1/train-cheetah_run-cem-12/scan/while/simulate/environment/simulate/step', defined at:
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/lukas/workspace/planet_src/planet/scripts/train.py", line 133, in <module>
    tf.app.run(lambda _: main(args_), remaining)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/home/lukas/workspace/planet_src/planet/scripts/train.py", line 133, in <lambda>
    tf.app.run(lambda _: main(args_), remaining)
  File "/home/lukas/workspace/planet_src/planet/scripts/train.py", line 106, in main
    for unused_score in run:
  File "planet/training/running.py", line 199, in __iter__
    for value in self._process_fn(self._logdir, *args):
  File "/home/lukas/workspace/planet_src/planet/scripts/train.py", line 91, in process
    training.define_model, dataset, logdir, config):
  File "planet/training/utility.py", line 160, in train
    score, summary = model_fn(data, trainer, config)
  File "planet/training/define_model.py", line 133, in define_model
    name='should_collect_' + params.task.name)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2086, in cond
    orig_res_t, res_t = context_t.BuildCondBranch(true_fn)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1930, in BuildCondBranch
    original_result = fn()
  File "planet/training/utility.py", line 254, in simulate_episodes
    1, agent_config, name=name)
  File "planet/control/simulate.py", line 42, in simulate
    env_processes=env_processes)
  File "planet/control/simulate.py", line 78, in collect_rollouts
    initializer, parallel_iterations=1)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/ops/functional_ops.py", line 718, in scan
    maximum_iterations=n)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3291, in while_loop
    return_same_structure)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3004, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2939, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3260, in <lambda>
    body = lambda i, lv: (i + 1, orig_body(*lv))
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/ops/functional_ops.py", line 697, in compute
    a_out = fn(packed_a, packed_elems)
  File "planet/control/simulate.py", line 63, in simulate_fn
    reset=tf.equal(step, 0))
  File "planet/control/simulate.py", line 219, in simulate_step
    step, score, length = _define_step()
  File "planet/control/simulate.py", line 150, in _define_step
    with tf.control_dependencies([batch_env.step(action)]):
  File "planet/control/in_graph_batch_env.py", line 96, in step
    [observ_dtype, tf.float32, tf.bool], name='step')
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/ops/script_ops.py", line 457, in py_func
    func=func, inp=inp, Tout=Tout, stateful=stateful, eager=False, name=name)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/ops/script_ops.py", line 281, in _internal_py_func
    input=inp, token=token, Tout=Tout, name=name)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/ops/gen_script_ops.py", line 129, in py_func
    "PyFunc", input=input, token=token, Tout=Tout, name=name)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/home/lukas/miniconda3/envs/planet/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()
UnknownError (see above for traceback): exceptions.RuntimeError: Cannot make context current on thread <_DummyThread(Dummy-5, started daemon 140200046999296)>: this context is already current on another thread <_DummyThread(Dummy-4, started daemon 140200038606592)>.
WARNING:tensorflow:Worker 006d3f0c-93c1-4a10-aee4-32ab0c8b125d run 00001: Failed.
Please, @danijar, could you look again into the source code and issue #6? Note my last comment there: there might be something wrong with the order of the "reset" and "close" messages. Also important: I tried to call render directly on the ExternalProcess wrapper, which started to yield errors similar to those in issue #6. Maybe that will be of some help.
Here's the source code:
from dm_control import suite
import numpy as np
from dm_control import viewer
from threading import Thread
import cv2

from planet.control.wrappers import ExternalProcess, ActionRepeat, LimitDuration, PixelObservations, ConvertTo32Bit
from planet.control.wrappers import DeepMindWrapper


def rewards(env):
  # Step through an episode and print out reward, discount and observation.
  action_spec = env.action_spec()
  time_step = env.reset()
  while True:
    action = np.random.uniform(action_spec.minimum,
                               action_spec.maximum,
                               size=action_spec.shape)
    time_step = env.step(action)
    img = env.physics.render()
    cv2.imshow("img", img)
    cv2.waitKey(0)


# Load one task:
# env = suite.load(domain_name="cartpole", task_name="swingup")
def env_ctor():
  env = DeepMindWrapper(suite.load("cartpole", "swingup"), (64, 64))
  env = ActionRepeat(env, 2)
  # env = LimitDuration(env, 1000)
  env = PixelObservations(env, (64, 64), np.uint8, 'image')
  env = ConvertTo32Bit(env)
  return env


env = ExternalProcess(env_ctor)

# Iterate over a task set:
# for domain_name, task_name in suite.BENCHMARKING:
#   env = suite.load(domain_name, task_name)
action_spec = env.action_space
time_step = env.reset()
while True:
  action = action_spec.sample()
  time_step = env.step(action)
  img = env.call("render")()
  cv2.imshow("img", img)
  cv2.waitKey(1)

# viewer.launch(env)
@astronautas Are your results for different dm_control rendering options consistent with https://github.com/google-research/planet/issues/6#issuecomment-474493971? The error message you attached seems to be specific to GLFW.
In your example script, you're trying to call render on the physics object directly. I doubt this will work, since the ExternalProcess wrapper would try to pickle the physics object to send it over to the main process. Instead, can you look at the observations returned by the environment? This way, the images will be rendered by the PixelObservations wrapper within the same process in which the environment lives.
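As a minimal sketch of this suggestion (not code from the repository; it assumes the env_ctor wrappers from the script above, where PixelObservations stores the frame under the 'image' key and the wrapped environment follows the obs/reward/done/info step convention):

import cv2

# Read frames from the observation dict instead of calling render() across
# the process boundary. The image is rendered by PixelObservations inside
# the environment process and sent back through the ExternalProcess pipe.
obs = env.reset()
while True:
  action = env.action_space.sample()
  obs, reward, done, info = env.step(action)
  cv2.imshow('img', obs['image'])
  cv2.waitKey(1)
  if done:
    obs = env.reset()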
It's not impossible that the problem lies in the reset message. However, it works well for many other people including myself. I'm not sure how a race condition could occur since the env worker just pulls one message after another from the pipe. That being said, please feel free to look at the external process wrapper and see if you find a problem -- the code is quite simple. I just did and I didn't see a problem with it.
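To illustrate the argument, here is a rough, hypothetical sketch of the message-loop pattern such a wrapper uses (simplified, not the actual planet/control/wrappers.py code; the message constants and _worker function are made up): the worker blocks on the pipe and handles exactly one message at a time, so a later "close" cannot overtake an earlier "reset".

import multiprocessing

_CALL, _RESULT, _CLOSE = range(3)

def _worker(conn, env_ctor):
  # Runs in the child process; handles one message at a time, in order.
  env = env_ctor()
  try:
    while True:
      message, name, args = conn.recv()
      if message == _CALL:
        conn.send((_RESULT, getattr(env, name)(*args)))
      elif message == _CLOSE:
        break
  finally:
    conn.close()

# Parent side: every request is written to the pipe and answered in order.
# parent_conn, child_conn = multiprocessing.Pipe()
# process = multiprocessing.Process(target=_worker, args=(child_conn, env_ctor))
# process.start()
# parent_conn.send((_CALL, 'reset', ()))
# message, result = parent_conn.recv()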
Hi @astronautas, @JamesLuoau, I've updated the dependency section of the readme to list precise versions and compatible rendering options. Could you please verify if you still have a problem running the code under the specified setup?
@danijar I'll reinstall all the dependencies in a new conda environment and verify whether it succeeds. I have a feeling that this could work on Ubuntu 18.04, since multiple users got it working under 18.04. I have 16.04, so there might be some issues with that.
EDIT: I still cannot get it working under 16.04 with all the correct dependency versions. One thing: I was not able to install dm_control via setup.py (there is no PyPI package for it). I installed it from their GitHub repository (as instructed in the README.md). Could you also specify the dm_control version?
@danijar Solved this one, thanks again.
Thanks, awesome! What did you do to make it work?
It's the same solution as for the "Connection reset by peer" problem. I upgraded Ubuntu from 16.04 to 18.04 and installed the latest Nvidia drivers (418). Using EGL rendering, it runs, and quite well!
So I'm not sure what exactly helped, but this could be a heuristic for other people trying to reproduce the code :)
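For reference, a minimal sketch of selecting the EGL backend (an assumption based on dm_control's rendering module, which reads the MUJOCO_GL environment variable; the variable must be set before dm_control is imported, and the cheetah task here is just an example):

import os

# Assumption: this dm_control version picks its rendering backend from MUJOCO_GL.
os.environ['MUJOCO_GL'] = 'egl'  # other common values: 'glfw', 'osmesa'

from dm_control import suite

env = suite.load('cheetah', 'run')
pixels = env.physics.render(64, 64, camera_id=0)
print(pixels.shape)  # expected (64, 64, 3)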
Thanks for letting others know and nice that it's working now!
@astronautas I'm starting a new thread for this to keep things separated. Thank you for looking more into this! To recap, the error message you reported comes from dm_control's renderer.
Here are a few suggestions of what to try:
1. The environment is created via the DeepMindWrapper and the ExternalProcess wrapper, so see if these two work for you outside of the PlaNet code.
2. Try the change to scripts/tasks.py that worked for user @2877992943 on Mac; see https://github.com/google-research/planet/issues/2#issuecomment-468171372.
Note that TensorFlow has only one process, but that process has a thread pool. When data gets collected, we call tf.py_func() to step the environment. This will be called by any of the threads, depending on where TensorFlow decided to schedule the operation.
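As a small, hypothetical illustration of this point (TF 1.x style to match the PlaNet code; the _step function, shapes, and action size are made up, not taken from the repository): the Python function handed to tf.py_func runs on whichever thread the TensorFlow runtime schedules it on, which is why a GL context bound to one thread can end up being used from another.

import threading
import numpy as np
import tensorflow as tf  # assumes TF 1.x

def _step(action):
  # Runs inside TensorFlow's thread pool, not on the main Python thread.
  print('py_func executed on', threading.current_thread().name)
  observ = np.zeros((64, 64, 3), np.uint8)
  return observ, np.float32(0.0), np.array(False)

action = tf.zeros([6], tf.float32)
observ, reward, done = tf.py_func(
    _step, [action], [tf.uint8, tf.float32, tf.bool], name='step')

with tf.Session() as sess:
  for _ in range(3):
    sess.run([observ, reward, done])  # the printed thread name may differ between calls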