ryanjulian closed this 6 years ago
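This was probably introduced by the TensorBoard logger.
It also means we can't launch TensorBoard in parallel with training. Doing so fails with CUDA_ERROR_OUT_OF_MEMORY when TensorBoard tries to create a TensorFlow session: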
(garage) rjulian@nyquist:~/.../garage/sandbox/embed2learn$ tensorboard --logdir ../../data/local/trpo-point-embed/
2018-06-14 16:50:11.081257: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2018-06-14 16:50:12.551595: E tensorflow/core/common_runtime/direct_session.cc:154] Internal: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OUT_OF_MEMORY; total memory reported: 12788498432
Traceback (most recent call last):
  File "/home/rjulian/miniconda2/envs/garage/bin/tensorboard", line 11, in <module>
    sys.exit(run_main())
  File "/home/rjulian/miniconda2/envs/garage/lib/python3.5/site-packages/tensorboard/main.py", line 36, in run_main
    tf.app.run(main)
  File "/home/rjulian/miniconda2/envs/garage/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "/home/rjulian/miniconda2/envs/garage/lib/python3.5/site-packages/tensorboard/main.py", line 45, in main
    default.get_assets_zip_provider())
  File "/home/rjulian/miniconda2/envs/garage/lib/python3.5/site-packages/tensorboard/program.py", line 166, in main
    tb = create_tb_app(plugins, assets_zip_provider)
  File "/home/rjulian/miniconda2/envs/garage/lib/python3.5/site-packages/tensorboard/program.py", line 201, in create_tb_app
    flags=FLAGS)
  File "/home/rjulian/miniconda2/envs/garage/lib/python3.5/site-packages/tensorboard/backend/application.py", line 126, in standard_tensorboard_wsgi
    plugin_instances = [constructor(context) for constructor in plugins]
  File "/home/rjulian/miniconda2/envs/garage/lib/python3.5/site-packages/tensorboard/backend/application.py", line 126, in <listcomp>
    plugin_instances = [constructor(context) for constructor in plugins]
  File "/home/rjulian/miniconda2/envs/garage/lib/python3.5/site-packages/tensorboard/plugins/beholder/beholder_plugin.py", line 47, in __init__
    self.most_recent_frame = im_util.get_image_relative_to_script('no-data.png')
  File "/home/rjulian/miniconda2/envs/garage/lib/python3.5/site-packages/tensorboard/plugins/beholder/im_util.py", line 254, in get_image_relative_to_script
    return read_image(filename)
  File "/home/rjulian/miniconda2/envs/garage/lib/python3.5/site-packages/tensorboard/plugins/beholder/im_util.py", line 242, in read_image
    return np.array(decode_png(image_file.read()))
  File "/home/rjulian/miniconda2/envs/garage/lib/python3.5/site-packages/tensorboard/plugins/beholder/im_util.py", line 159, in __call__
    self._lazily_initialize()
  File "/home/rjulian/miniconda2/envs/garage/lib/python3.5/site-packages/tensorboard/plugins/beholder/im_util.py", line 137, in _lazily_initialize
    self._session = tf.Session(graph=graph, config=config)
  File "/home/rjulian/miniconda2/envs/garage/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1560, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/rjulian/miniconda2/envs/garage/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 633, in __init__
    self._session = tf_session.TF_NewSession(self._graph._c_graph, opts)
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.
Confirmed that this error only occurs with tensorflow_gpu.
Fixed in #88
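For anyone hitting the same traceback before picking up that fix: I don't know that this is what #88 does, but a possible workaround is to hide the GPU from the TensorBoard process so its Beholder plugin cannot try to open a CUDA session. A minimal sketch, assuming TensorBoard is on the PATH; the helper names `tensorboard_cpu_env` and `launch_tensorboard` are made up for illustration and are not part of garage or TensorBoard.

```python
import os
import subprocess


def tensorboard_cpu_env(base_env=None):
    """Return a copy of the environment with all CUDA devices hidden.

    Hypothetical helper, not part of garage or TensorBoard.
    """
    env = dict(os.environ if base_env is None else base_env)
    # An empty CUDA_VISIBLE_DEVICES hides every GPU from CUDA in the
    # child process, so TensorFlow there falls back to the CPU.
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


def launch_tensorboard(logdir, base_env=None):
    """Start TensorBoard as a child process that cannot see any GPU.

    Hypothetical helper; assumes the `tensorboard` executable is on PATH.
    """
    cmd = ["tensorboard", "--logdir", logdir]
    return subprocess.Popen(cmd, env=tensorboard_cpu_env(base_env))


if __name__ == "__main__":
    # Same log directory as in the report above.
    proc = launch_tensorboard("../../data/local/trpo-point-embed/")
    proc.wait()
```

From a shell, the equivalent is to set `CUDA_VISIBLE_DEVICES=` in front of the `tensorboard` command. The training process is unaffected because the variable is only changed for the child process.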