experiencor / keras-yolo3

Training and Detecting Objects with YOLO3
MIT License

tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized. #96

Open dasfaha opened 6 years ago

dasfaha commented 6 years ago

I am training the 'raccoon' model on an AWS p2.large (Tesla K80) running "Deep Learning Base AMI (Amazon Linux) Version 7.0 (ami-6bc6ac14)".

I get the exceptions below when I run this command: python train.py -c config.json

Has anyone seen this?

2018-06-10 22:42:21.448910: W tensorflow/core/common_runtime/bfc_allocator.cc:279] ****
Traceback (most recent call last):
  File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1322, in _do_call
    return fn(*args)
  File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1307, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1409, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
  [[Node: Tile_3/_4497 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_13327_Tile_3", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 280, in <module>
    main(args)
  File "train.py", line 257, in main
    max_queue_size = 8
  File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/keras/engine/training.py", line 2230, in fit_generator
    class_weight=class_weight)
  File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1883, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2482, in __call__
    **self.session_kwargs)
  File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/home/ec2-user/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
  [[Node: Tile_3/_4497 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_13327_Tile_3", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

CHOcho-quan commented 5 years ago

It's an OOM (out-of-memory) error on the GPU. Please set your batch_size to 1.
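
For anyone who wants to script that change instead of editing the file by hand, here is a minimal sketch. It assumes batch_size sits under a "train" section of the JSON config; that layout, the helper name with_batch_size, and the starting value of 16 are illustrative assumptions, so check them against your own config.json.

```python
import copy
import json


def with_batch_size(config, batch_size, section="train", key="batch_size"):
    """Return a copy of config with config[section][key] set to batch_size.

    The "train" / "batch_size" layout is an assumption about the config file,
    not something confirmed in this thread.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    updated.setdefault(section, {})[key] = batch_size
    return updated


if __name__ == "__main__":
    # Hypothetical stand-in for the contents of config.json.
    example = json.loads('{"train": {"batch_size": 16}}')
    print(json.dumps(with_batch_size(example, 1), indent=2))
```

To apply it to a real file, load config.json with json.load, pass the dict through the helper, and write the result back with json.dump. If batch_size 1 trains without the error, you can raise it step by step until the GPU runs out of memory again.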