experiencor / keras-yolo3

Training and Detecting Objects with YOLO3
MIT License

tensorflow.python.framework.errors_impl.NotFoundError: Resource localhost/_AnonymousVar330/N10tensorflow3VarE does not exist. #272

Open joseph-jomon opened 4 years ago

joseph-jomon commented 4 years ago

I tried to run train.py in a TensorFlow 2.0 conda environment but got this error. I really don't know what to make of it and would really appreciate your help.

2020-05-09 11:23:38.277906: W tensorflow/core/common_runtime/bfc_allocator.cc:424] *****
2020-05-09 11:23:38.308158: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at conv_ops.cc:501 : Resource exhausted: OOM when allocating tensor with shape[4,32,416,416] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
  File "train.py", line 299, in <module>
    main(args)
  File "train.py", line 276, in main
    max_queue_size = 8
  File "/home/joseph/anaconda3/envs/ML/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/joseph/anaconda3/envs/ML/lib/python3.6/site-packages/keras/engine/training.py", line 1732, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/joseph/anaconda3/envs/ML/lib/python3.6/site-packages/keras/engine/training_generator.py", line 220, in fit_generator
    reset_metrics=False)
  File "/home/joseph/anaconda3/envs/ML/lib/python3.6/site-packages/keras/engine/training.py", line 1514, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/joseph/anaconda3/envs/ML/lib/python3.6/site-packages/tensorflow_core/python/keras/backend.py", line 3740, in __call__
    outputs = self._graph_fn(*converted_inputs)
  File "/home/joseph/anaconda3/envs/ML/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1081, in __call__
    return self._call_impl(args, kwargs)
  File "/home/joseph/anaconda3/envs/ML/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1121, in _call_impl
    return self._call_flat(args, self.captured_inputs, cancellation_manager)
  File "/home/joseph/anaconda3/envs/ML/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1224, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager)
  File "/home/joseph/anaconda3/envs/ML/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 511, in call
    ctx=ctx)
  File "/home/joseph/anaconda3/envs/ML/lib/python3.6/site-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from

tensorflow.python.framework.errors_impl.NotFoundError: Resource localhost/_AnonymousVar330/N10tensorflow3VarE does not exist. [[node yolo_layer_2/AssignAddVariableOp (defined at /home/joseph/anaconda3/envs/ML/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py:1751) ]] [Op:__inference_keras_scratch_graph_77400]

Function call stack: keras_scratch_graph
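
For reference, the size of the tensor named in the OOM warning can be worked out from its shape and dtype. This is only a rough sketch: it assumes "type float" means 32-bit floats (4 bytes each), and the helper name `tensor_bytes` is made up for illustration. It says nothing about what else was occupying the GPU at the time, or whether the OOM warning is related to the NotFoundError.

```python
# Shape copied from the OOM warning in the log above.
shape = (4, 32, 416, 416)
BYTES_PER_FLOAT32 = 4  # assumption: "type float" is a 32-bit float


def tensor_bytes(shape, itemsize):
    """Return the bytes needed to store a dense tensor of this shape."""
    count = 1
    for dim in shape:
        count *= dim  # total number of elements
    return count * itemsize


size = tensor_bytes(shape, BYTES_PER_FLOAT32)
print(f"{size} bytes = {size / 2**20:.1f} MiB")
```

The leading 4 is presumably the batch size, so a smaller batch or input resolution would shrink this tensor proportionally.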

jbutle55 commented 4 years ago

@joseph-jomon Did you figure out a solution to this problem?

whitewalker11 commented 4 years ago

same issue