I am training the 'raccoon' model on an AWS p2.large (Tesla K80) running "Deep Learning Base AMI (Amazon Linux) Version 7.0 (ami-6bc6ac14)".
I get the exceptions below for this command:
python train.py -c config.json
Has anyone seen this?
2018-06-10 22:42:21.448910: W tensorflow/core/common_runtime/bfc_allocator.cc:279] ****
Traceback (most recent call last):
File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
return fn(*args)
File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
[[Node: Tile_3/_4497 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_13327_Tile_3", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 280, in <module>
main(args)
File "train.py", line 257, in main
max_queue_size = 8
File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/keras/engine/training.py", line 2230, in fit_generator
class_weight=class_weight)
File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1883, in train_on_batch
outputs = self.train_function(ins)
File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2482, in __call__
**self.session_kwargs)
File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
run_metadata_ptr)
File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
feed_dict_tensor, options, run_metadata)
File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
run_metadata)
File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
[[Node: Tile_3/_4497 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_13327_Tile_3", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]