pierluigiferrari / ssd_keras

A Keras port of Single Shot MultiBox Detector
Apache License 2.0

ValueError: Layer model expects 1 input(s), but it received 2 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, None, None, None) dtype=uint8>, <tf.Tensor 'IteratorGetNext:1' shape=(None, None, None) dtype=float32>] #380

Open suprateembanerjee opened 3 years ago

suprateembanerjee commented 3 years ago

Tensorflow V2 (latest) Keras (latest) ssd300_training.ipynb

I have managed to convert most of the V1 code to V2 and run it successfully, changing the Python files as necessary. However, this error occurs on the line

history = model.fit_generator(generator=train_generator, steps_per_epoch=steps_per_epoch, epochs=final_epoch, callbacks=callbacks, validation_data=val_generator, validation_steps=ceil(val_dataset_size/batch_size), initial_epoch=initial_epoch)

Entire error:


Epoch 1/120

Epoch 00001: LearningRateScheduler reducing learning rate to 0.001.

ValueError                                Traceback (most recent call last)

<ipython-input-...> in <module>
      4 steps_per_epoch = 1000
      5
----> 6 history = model.fit_generator(generator=train_generator,
      7 steps_per_epoch=steps_per_epoch,
      8 epochs=final_epoch,

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, validation_freq, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
   1844 'will be removed in a future version. '
   1845 'Please use `Model.fit`, which supports generators.')
-> 1846 return self.fit(
   1847 generator,
   1848 steps_per_epoch=steps_per_epoch,

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
   1097 _r=1):
   1098 callbacks.on_train_batch_begin(step)
-> 1099 tmp_logs = self.train_function(iterator)
   1100 if data_handler.should_sync:
   1101 context.async_wait()

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\def_function.py in __call__(self, *args, **kwds)
    782 tracing_count = self.experimental_get_tracing_count()
    783 with trace.Trace(self._name) as tm:
--> 784 result = self._call(*args, **kwds)
    785 compiler = "xla" if self._experimental_compile else "nonXla"
    786 new_tracing_count = self.experimental_get_tracing_count()

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\def_function.py in _call(self, *args, **kwds)
    825 # This is the first call of __call__, so we have to initialize.
    826 initializers = []
--> 827 self._initialize(args, kwds, add_initializers_to=initializers)
    828 finally:
    829 # At this point we know that the initialization is complete (or less

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\def_function.py in _initialize(self, args, kwds, add_initializers_to)
    679 self._graph_deleter = FunctionDeleter(self._lifted_initializer_graph)
    680 self._concrete_stateful_fn = (
--> 681 self._stateful_fn._get_concrete_function_internal_garbage_collected( # pylint: disable=protected-access
    682 *args, **kwds))
    683

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\function.py in _get_concrete_function_internal_garbage_collected(self, *args, **kwargs)
   2995 args, kwargs = None, None
   2996 with self._lock:
-> 2997 graph_function, _ = self._maybe_define_function(args, kwargs)
   2998 return graph_function
   2999

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\function.py in _maybe_define_function(self, args, kwargs)
   3387
   3388 self._function_cache.missed.add(call_context_key)
-> 3389 graph_function = self._create_graph_function(args, kwargs)
   3390 self._function_cache.primary[cache_key] = graph_function
   3391

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\function.py in _create_graph_function(self, args, kwargs, override_flat_arg_shapes)
   3222 arg_names = base_arg_names + missing_arg_names
   3223 graph_function = ConcreteFunction(
-> 3224 func_graph_module.func_graph_from_py_func(
   3225 self._name,
   3226 self._python_function,

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\func_graph.py in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, autograph, autograph_options, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, override_flat_arg_shapes)
    994 _, original_func = tf_decorator.unwrap(python_func)
    995
--> 996 func_outputs = python_func(*func_args, **func_kwargs)
    997
    998 # invariant: `func_outputs` contains only Tensors, CompositeTensors,

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\def_function.py in wrapped_fn(*args, **kwds)
    588 xla_context.Exit()
    589 else:
--> 590 out = weak_wrapped_fn().__wrapped__(*args, **kwds)
    591 return out
    592

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\func_graph.py in wrapper(*args, **kwargs)
    981 except Exception as e: # pylint:disable=broad-except
    982 if hasattr(e, "ag_error_metadata"):
--> 983 raise e.ag_error_metadata.to_exception(e)
    984 else:
    985 raise

ValueError: in user code:

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py:804 train_function  *
    return step_function(self, iterator)
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py:794 step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:1259 run
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2730 call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:3417 _call_for_each_replica
    return fn(*args, **kwargs)
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py:787 run_step  **
    outputs = model.train_step(data)
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py:753 train_step
    y_pred = self(x, training=True)
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\base_layer.py:1000 __call__
    input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\input_spec.py:204 assert_input_compatibility
    raise ValueError('Layer ' + layer_name + ' expects ' +

ValueError: Layer model expects 1 input(s), but it received 2 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, None, None, None) dtype=uint8>, <tf.Tensor 'IteratorGetNext:1' shape=(None, None, None) dtype=float32>]

This Stack Overflow post (https://stackoverflow.com/questions/61586981/valueerror-layer-sequential-20-expects-1-inputs-but-it-received-2-input-tensor#) suggests the error has something to do with the validation_data parameter of fit(). It points to a change in the required structure, from lists to tuples, between TF 1.x and TF 2.x. However, we are not passing such a structure at all, but a generator. I don't understand what is going wrong.

ukowa commented 3 years ago

Hi, thank you @suprateem48 for bringing this up. I'm facing the same issue. Please find my stack trace below for reference. Any advice on how to solve this issue would be highly appreciated. Thank you in advance!

existing dataset files found -> loading....
Loading labels: 100%|██████████| 16551/16551 [00:03<00:00, 4725.84it/s]
Loading image IDs: 100%|██████████| 16551/16551 [00:01<00:00, 9289.97it/s]
Loading evaluation-neutrality annotations: 100%|██████████| 16551/16551 [00:02<00:00, 7384.34it/s]
Loading labels: 100%|██████████| 4952/4952 [00:01<00:00, 4734.14it/s]
Loading image IDs: 100%|██████████| 4952/4952 [00:00<00:00, 9295.89it/s]
Loading evaluation-neutrality annotations: 100%|██████████| 4952/4952 [00:00<00:00, 7491.51it/s]
Number of images in the training dataset: 16551
Number of images in the validation dataset: 4952
2021-05-03 07:25:05.965847: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
Epoch 1/120

Epoch 00001: LearningRateScheduler reducing learning rate to 0.001.
Traceback (most recent call last):
  File "D:\projects\python\ssd_test\ssd_test.py", line 288, in <module>
    use_multiprocessing=False)
  File "C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1100, in fit
    tmp_logs = self.train_function(iterator)
  File "C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\eager\def_function.py", line 828, in __call__
    result = self._call(*args, **kwds)
  File "C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\eager\def_function.py", line 871, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\eager\def_function.py", line 726, in _initialize
    *args, **kwds))
  File "C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\eager\function.py", line 2969, in _get_concrete_function_internal_garbage_collected
    graph_function, _ = self._maybe_define_function(args, kwargs)
  File "C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\eager\function.py", line 3361, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\eager\function.py", line 3206, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\framework\func_graph.py", line 990, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\eager\def_function.py", line 634, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\framework\func_graph.py", line 977, in wrapper
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\keras\engine\training.py:805 train_function  *
    return step_function(self, iterator)
C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\keras\engine\training.py:795 step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:1259 run
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2730 call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:3417 _call_for_each_replica
    return fn(*args, **kwargs)
C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\keras\engine\training.py:788 run_step  **
    outputs = model.train_step(data)
C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\keras\engine\training.py:754 train_step
    y_pred = self(x, training=True)
C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\keras\engine\base_layer.py:998 __call__
    input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
C:\Program Files\Python\Python37\lib\site-packages\tensorflow\python\keras\engine\input_spec.py:207 assert_input_compatibility
    ' input tensors. Inputs received: ' + str(inputs))

ValueError: Layer model expects 1 input(s), but it received 2 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, None, None, None) dtype=uint8>, <tf.Tensor 'IteratorGetNext:1' shape=(None, None, None) dtype=float32>]
bfhaha commented 3 years ago

Hello. I had the same problem and have been racking my brain over it these past few days. Finally, I used validation_data=tuple(val_generator) instead of validation_data=val_generator, and the error went away. However, I then ran out of memory (Google Colab free version), so I am looking for another environment.

By the way, in my case the command history = model.fit_generator(generator=train_generator, no longer works; I have to use history = model.fit(train_generator, instead.

suprateembanerjee commented 3 years ago

@bfhaha I tried your fix, but it did not solve the issue for me. Running the code

initial_epoch = 0
final_epoch = 120
steps_per_epoch = 1000

history = model.fit(x=train_generator, steps_per_epoch=steps_per_epoch, epochs=final_epoch, callbacks=callbacks, validation_data=tuple(val_generator), validation_steps=ceil(val_dataset_size/batch_size), initial_epoch=initial_epoch)

still results in

ValueError                                Traceback (most recent call last)

<ipython-input-...> in <module>
      3 steps_per_epoch = 1000
      4
----> 5 history = model.fit(x=train_generator,
      6 steps_per_epoch=steps_per_epoch,
      7 epochs=final_epoch,

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
   1097 _r=1):
   1098 callbacks.on_train_batch_begin(step)
-> 1099 tmp_logs = self.train_function(iterator)
   1100 if data_handler.should_sync:
   1101 context.async_wait()

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\def_function.py in __call__(self, *args, **kwds)
    782 tracing_count = self.experimental_get_tracing_count()
    783 with trace.Trace(self._name) as tm:
--> 784 result = self._call(*args, **kwds)
    785 compiler = "xla" if self._experimental_compile else "nonXla"
    786 new_tracing_count = self.experimental_get_tracing_count()

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\def_function.py in _call(self, *args, **kwds)
    816 # In this case we have not created variables on the first call. So we can
    817 # run the first trace but we should fail if variables are created.
--> 818 results = self._stateful_fn(*args, **kwds)
    819 if self._created_variables:
    820 raise ValueError("Creating variables on a non-first call to a function"

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\function.py in __call__(self, *args, **kwargs)
   2967 with self._lock:
   2968 (graph_function,
-> 2969 filtered_flat_args) = self._maybe_define_function(args, kwargs)
   2970 return graph_function._call_flat(
   2971 filtered_flat_args, captured_inputs=graph_function.captured_inputs) # pylint: disable=protected-access

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\function.py in _maybe_define_function(self, args, kwargs)
   3383 self.input_signature is None and
   3384 call_context_key in self._function_cache.missed):
-> 3385 return self._define_function_with_shape_relaxation(
   3386 args, kwargs, flat_args, filtered_flat_args, cache_key_context)
   3387

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\function.py in _define_function_with_shape_relaxation(self, args, kwargs, flat_args, filtered_flat_args, cache_key_context)
   3305 expand_composites=True)
   3306
-> 3307 graph_function = self._create_graph_function(
   3308 args, kwargs, override_flat_arg_shapes=relaxed_arg_shapes)
   3309 self._function_cache.arg_relaxed[rank_only_cache_key] = graph_function

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\function.py in _create_graph_function(self, args, kwargs, override_flat_arg_shapes)
   3222 arg_names = base_arg_names + missing_arg_names
   3223 graph_function = ConcreteFunction(
-> 3224 func_graph_module.func_graph_from_py_func(
   3225 self._name,
   3226 self._python_function,

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\func_graph.py in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, autograph, autograph_options, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, override_flat_arg_shapes)
    994 _, original_func = tf_decorator.unwrap(python_func)
    995
--> 996 func_outputs = python_func(*func_args, **func_kwargs)
    997
    998 # invariant: `func_outputs` contains only Tensors, CompositeTensors,

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\eager\def_function.py in wrapped_fn(*args, **kwds)
    588 xla_context.Exit()
    589 else:
--> 590 out = weak_wrapped_fn().__wrapped__(*args, **kwds)
    591 return out
    592

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\func_graph.py in wrapper(*args, **kwargs)
    981 except Exception as e: # pylint:disable=broad-except
    982 if hasattr(e, "ag_error_metadata"):
--> 983 raise e.ag_error_metadata.to_exception(e)
    984 else:
    985 raise

ValueError: in user code:

c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py:804 train_function  *
    return step_function(self, iterator)
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py:794 step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:1259 run
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2730 call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:3417 _call_for_each_replica
    return fn(*args, **kwargs)
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py:787 run_step  **
    outputs = model.train_step(data)
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\training.py:753 train_step
    y_pred = self(x, training=True)
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\base_layer.py:1000 __call__
    input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
c:\users\dolphin48\.conda\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\keras\engine\input_spec.py:204 assert_input_compatibility
    raise ValueError('Layer ' + layer_name + ' expects ' +

ValueError: Layer model expects 1 input(s), but it received 2 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, None, None, None) dtype=uint8>, <tf.Tensor 'IteratorGetNext:1' shape=(None, None, None) dtype=float32>]

bfhaha commented 3 years ago

@suprateem48 Sorry, I really don't know where the problem is. If I were you, I would run the following commands after defining val_generator to observe the difference:

print(val_generator) # It is supposed to be <generator object DataGenerator.generate at 0x7f33b9691a50>
print(tuple(val_generator)) # It is supposed to be ()

suprateembanerjee commented 3 years ago

@bfhaha Weird thing: I managed to reproduce your memory issue even on my 8GB RTX 2070 Super, but that error appears only the first time the kernel runs model.fit(). Every subsequent time model.fit() is rerun on the same kernel, it throws the old tuple-related error.

ukowa commented 3 years ago

@bfhaha thanks for your fix. I've tried it as well, with the same result: a memory error (32GB RAM Predator, GeForce 1070). I gave it a second try with a reduced dataset of just 8 images, but got the same result. I know it doesn't really help, but I just wanted to share the information...

bfhaha commented 3 years ago

Hello. I have tried to run the notebook on a Google Compute Engine (E2 series, e2-highmem-16, 16vCPU, 128 GB memory) 80 GB disk. It also crashed... I was running ssd7_training.ipynb, not ssd300.

bfhaha commented 3 years ago

I rented a Google Compute Engine instance (N2 series, custom 8 vCPU, 640 GB memory, 200 GB disk) yesterday, and it showed the following message after running for one hour.

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-24-fb2ae06e0e3c> in <module>
      9                               epochs=final_epoch,
     10                               callbacks=callbacks,
---> 11                               validation_data=tuple(val_generator),
     12                               validation_steps=ceil(val_dataset_size/batch_size),
     13                               initial_epoch=initial_epoch)

~/data_generator/object_detection_2d_data_generator.py in generate(self, batch_size, shuffle, transformations, label_encoder, returns, keep_images_without_gt, degenerate_box_handling)
   1149                     batch_y_encoded, batch_matched_anchors = label_encoder(batch_y, diagnostics=True)
   1150                 else:
-> 1151                     batch_y_encoded = label_encoder(batch_y, diagnostics=False)
   1152                     batch_matched_anchors = None
   1153 

~/ssd_encoder_decoder/ssd_input_encoder.py in __call__(self, ground_truth_labels, diagnostics)
    311         ##################################################################################
    312 
--> 313         y_encoded = self.generate_encoding_template(batch_size=batch_size, diagnostics=False)
    314 
    315         ##################################################################################

~/ssd_encoder_decoder/ssd_input_encoder.py in generate_encoding_template(self, batch_size, diagnostics)
    604         #    shape as the SSD model output tensor. The content of this tensor is irrelevant, we'll just use
    605         #    `boxes_tensor` a second time.
--> 606         y_encoding_template = np.concatenate((classes_tensor, boxes_tensor, boxes_tensor, variances_tensor), axis=2)
    607 
    608         if diagnostics:

<__array_function__ internals> in concatenate(*args, **kwargs)

MemoryError: Unable to allocate 25.7 MiB for an array with shape (16, 11692, 18) and data type float64
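
For scale, the allocation that finally failed is a small one. A quick check of the size reported in the message, assuming 8 bytes per float64 element:

```python
# Size of the array named in the MemoryError above.
shape = (16, 11692, 18)   # shape reported in the error message
bytes_per_element = 8     # float64

n_elements = 1
for dim in shape:
    n_elements *= dim

size_bytes = n_elements * bytes_per_element
size_mib = size_bytes / 2**20

print(f"{n_elements} elements, {size_bytes} bytes, {size_mib:.1f} MiB")
```

A single encoded batch is therefore only about 25.7 MiB. That the machine could not provide even this much suggests memory had already been used up by earlier allocations, which would be consistent with tuple(val_generator) collecting batch after batch.
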
suprateembanerjee commented 3 years ago

@bfhaha Yes, this is the exact memory-related issue I faced as well.

bfhaha commented 3 years ago

@suprateem48 Have you ever tried this solution? https://stackoverflow.com/questions/57507832/unable-to-allocate-array-with-shape-and-data-type I don't have enough money to rent the VM instance to test again.

JuliusJacobitz commented 3 years ago

Same issue right here, will update if I can find something.

Edit: It seems that val_generator is infinite, i.e. calling next(val_generator) never runs out of batches? Not quite sure why. But calling tuple() on an infinite generator will cause a memory error.
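
To illustrate the point about tuple() on a generator that never ends, here is a minimal sketch with a hypothetical stand-in generator (`endless_batches` is not the repo's DataGenerator). `itertools.islice` takes a bounded number of batches, whereas `tuple(...)` on the same generator would keep consuming until memory runs out.

```python
import itertools

def endless_batches(data, batch_size):
    """Hypothetical stand-in for a training generator that loops over the data forever."""
    start = 0
    while True:  # never raises StopIteration
        batch = data[start:start + batch_size]
        if not batch:  # reached the end of the data: wrap around
            start = 0
            continue
        yield batch
        start += batch_size

gen = endless_batches(list(range(10)), batch_size=4)

# Safe: take exactly three batches from the infinite generator.
first_three = list(itertools.islice(gen, 3))
print(first_three)

# Unsafe: tuple(gen) would never finish and would keep allocating memory.
```
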

JuliusJacobitz commented 3 years ago

> By the way, in my case, the command history = model.fit_generator(generator=train_generator, doesn't work anymore, I have to use history = model.fit(train_generator,

@bfhaha could you show me how you used model.fit instead of model.fit_generator? :)

bfhaha commented 3 years ago

@JuliusJacobitz Sorry, I don't understand what you mean by "how" I used model.fit. When I was using history = model.fit_generator(generator=train_generator, it showed the following message:

UserWarning: Model.fit_generator is deprecated and will be removed in a future version. Please use Model.fit, which supports generators.

So I just changed model.fit_generator to model.fit. The example here shows that we don't have to pass generator= when using model.fit.

pirolone888 commented 3 years ago

Hi, I struggled with the same issue.

The reason was that the data generator returned a list, [batch_X, batch_y_encoded]. I changed generate() in the DataGenerator class in object_detection_2d_data_generator.py as follows. Instead of yielding a list, it now yields two values, batch_X and batch_y_encoded. It's a primitive solution, but it works fine.

  #########################################################################################
  # Compose the output.
  #########################################################################################

  ret = []
  if 'processed_images' in returns: ret.append(batch_X)
  if 'encoded_labels' in returns: ret.append(batch_y_encoded)
  if 'matched_anchors' in returns: ret.append(batch_matched_anchors)
  if 'processed_labels' in returns: ret.append(batch_y)
  if 'filenames' in returns: ret.append(batch_filenames)
  if 'image_ids' in returns: ret.append(batch_image_ids)
  if 'evaluation-neutral' in returns: ret.append(batch_eval_neutral)
  if 'inverse_transform' in returns: ret.append(batch_inverse_transforms)
  if 'original_images' in returns: ret.append(batch_original_images)
  if 'original_labels' in returns: ret.append(batch_original_labels)

  yield batch_X, batch_y_encoded # do not yield ret

FYI: Here is my model.fit.

history = model.fit(train_generator,
                    steps_per_epoch=ceil(train_dataset_size/batch_size),
                    epochs=final_epoch,
                    callbacks=callbacks,
                    validation_data=val_generator,
                    validation_steps=ceil(val_dataset_size/batch_size),
                    initial_epoch=initial_epoch,
                    verbose=1)
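
If editing object_detection_2d_data_generator.py is undesirable, the same idea can be applied from the outside. A minimal sketch, assuming each batch is a list whose first two entries are the images and the encoded labels, as described above; `as_xy_tuples` and `toy_generator` are hypothetical names, not part of ssd_keras:

```python
def as_xy_tuples(batches):
    """Re-yield each [x, y, ...] batch as an (x, y) tuple.

    Hypothetical helper: assumes the first two entries of every batch are
    the inputs and the targets, in that order.
    """
    for batch in batches:
        x, y = batch[0], batch[1]
        yield (x, y)

def toy_generator():
    """Toy stand-in for the real generator: yields lists, like `yield ret`."""
    yield ["images_0", "labels_0"]
    yield ["images_1", "labels_1"]

wrapped = list(as_xy_tuples(toy_generator()))
print(wrapped)
```

In the training call this would be passed as model.fit(as_xy_tuples(train_generator), ...). Whether it removes the error on a given TensorFlow version is not confirmed beyond the reports in this thread.
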
bfhaha commented 3 years ago

@pirolone888 Thanks. But it showed UnboundLocalError: local variable 'batch_X' referenced before assignment when I was trying your method.

daviddanialy commented 3 years ago

Hello, I also converted the code from tensorflow 1.x to tensorflow 2.4. I fixed the problem you are having by changing the DataGenerator in object_detection_2d_data_generator.py as such:

        ret = []
        if 'processed_images' in returns: ret.append(batch_X)
        if 'encoded_labels' in returns: ret.append(batch_y_encoded)
        if 'matched_anchors' in returns: ret.append(batch_matched_anchors)
        if 'processed_labels' in returns: ret.append(batch_y)
        if 'filenames' in returns: ret.append(batch_filenames)
        if 'image_ids' in returns: ret.append(batch_image_ids)
        if 'evaluation-neutral' in returns: ret.append(batch_eval_neutral)
        if 'inverse_transform' in returns: ret.append(batch_inverse_transforms)
        if 'original_images' in returns: ret.append(batch_original_images)
        if 'original_labels' in returns: ret.append(batch_original_labels)

        yield tuple(ret)

I simply changed yield ret to yield tuple(ret).
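
One plausible reading of why a list versus a tuple matters: fit() has to split each generated batch into inputs and targets, and in the TF 2.x versions discussed here it appears that only a tuple is split that way, while a list is handed to the model whole as its inputs. The sketch below is a simplified, hypothetical imitation of that rule (`split_batch` is not a Keras function), only meant to show how [x, y] can end up looking like two input tensors.

```python
def split_batch(batch):
    """Simplified imitation of splitting a batch into (inputs, targets).

    Assumption for illustration: only tuples are unpacked; anything else
    (including a list) is treated as the model inputs in its entirety.
    """
    if not isinstance(batch, tuple):
        return batch, None
    if len(batch) == 1:
        return batch[0], None
    return batch[0], batch[1]

images, labels = "batch_X", "batch_y_encoded"

# A list is taken as "two inputs, no targets".
from_list = split_batch([images, labels])
print(from_list)

# A tuple is taken as "one input plus targets".
from_tuple = split_batch((images, labels))
print(from_tuple)
```

This would match the error message (two input tensors received by a one-input model), but it does not explain the later reports in this thread where switching to a tuple made no difference.
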

bfhaha commented 3 years ago

@daviddanialy Thanks. So just place the following code under the function generate in object_detection_2d_data_generator.py? (It has been indented by eight spaces.)

        ret = []
        if 'processed_images' in returns: ret.append(batch_X)
        if 'encoded_labels' in returns: ret.append(batch_y_encoded)
        if 'matched_anchors' in returns: ret.append(batch_matched_anchors)
        if 'processed_labels' in returns: ret.append(batch_y)
        if 'filenames' in returns: ret.append(batch_filenames)
        if 'image_ids' in returns: ret.append(batch_image_ids)
        if 'evaluation-neutral' in returns: ret.append(batch_eval_neutral)
        if 'inverse_transform' in returns: ret.append(batch_inverse_transforms)
        if 'original_images' in returns: ret.append(batch_original_images)
        if 'original_labels' in returns: ret.append(batch_original_labels)

        yield tuple(ret)

It still showed the original error message (Layer model expects 1 input(s), but it received 2 input tensors...).

I have given up on this project and am now trying matterport's Mask R-CNN for object detection.

daviddanialy commented 3 years ago

That code is already in the generate function, you just change yield ret to yield tuple(ret). I may have to switch to a different repo as well, because I'm having issues with the predictions not ever exceeding the confidence threshold.

bfhaha commented 3 years ago

@daviddanialy Thanks. It doesn't work for me.

Hrrsmjd commented 3 years ago

Any solutions? Here is my code:

# TODO: Set the epochs to train for.
# If you're resuming a previous training, set `initial_epoch` and `final_epoch` accordingly.
initial_epoch   = 0
final_epoch     = 20
steps_per_epoch = 1000

history = model.fit(train_generator,
                    steps_per_epoch=steps_per_epoch,
                    epochs=final_epoch,
                    callbacks=callbacks,
                    validation_data=val_generator,
                    validation_steps=ceil(val_dataset_size/batch_size),
                    initial_epoch=initial_epoch)

            ret = []
            if 'processed_images' in returns: ret.append(batch_X)
            if 'encoded_labels' in returns: ret.append(batch_y_encoded)
            if 'matched_anchors' in returns: ret.append(batch_matched_anchors)
            if 'processed_labels' in returns: ret.append(batch_y)
            if 'filenames' in returns: ret.append(batch_filenames)
            if 'image_ids' in returns: ret.append(batch_image_ids)
            if 'evaluation-neutral' in returns: ret.append(batch_eval_neutral)
            if 'inverse_transform' in returns: ret.append(batch_inverse_transforms)
            if 'original_images' in returns: ret.append(batch_original_images)
            if 'original_labels' in returns: ret.append(batch_original_labels)

            yield ret

I have tried ret, tuple(ret), [ret], and I still get the following:

ValueError: Layer model expects 1 input(s), but it received 2 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, None, None, None) dtype=uint8>, <tf.Tensor 'IteratorGetNext:1' shape=(None, None, None) dtype=float32>]
jy0821 commented 2 years ago

> Hello. I had the same problem and I racked my brain these days. Finally, I use validation_data=tuple(val_generator), instead of validation_data=val_generator, The error has been solved. But I run out my memory (Google Colab Free Version) and looking for another environment.
>
> By the way, in my case, the command history = model.fit_generator(generator=train_generator, doesn't work anymore, I have to use history = model.fit(train_generator,

Tried that. It does not seem to work for me...

ZFTurbo commented 2 years ago

I solved the same issue for model.predict() in the case where the model has 2 inputs.

My generator output was: return (a, b)

I changed it to: return ((a, b), None)

generalMG commented 1 year ago

Hello! Since model.fit and model.fit_generator are essentially loops over several epochs, I stopped using them and instead wrote a custom for loop that enumerates the generated dataset (I mostly use PyTorch, so an explicit for loop is more convenient for me).

[Screenshot: custom dataset generator code (Screen Shot 2023-01-12 at 11 41 16 AM)]

Above, I am using a customized dataset generator (I modified this Python code: https://github.com/wjddyd66/Tensorflow2.0/blob/master/SSD/voc_data.py), which sends the generated dataset to the training code below:

[Screenshot: training code (Screen Shot 2023-01-12 at 11 42 00 AM)]

This way I was able to fix the problem above. Hope it helps!
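
Since the screenshots are not reproduced as text, the following is only a generic sketch of the approach described: iterate over the generator yourself and call a per-batch training step. `run_training`, `toy_batches` and `toy_train_step` are hypothetical placeholders; in TensorFlow the step would typically compute the loss under tf.GradientTape and apply the gradients with an optimizer.

```python
import itertools

def run_training(batches, train_step, epochs, steps_per_epoch):
    """Manual replacement for model.fit: loop over epochs and batches explicitly."""
    history = []
    for _ in range(epochs):
        epoch_losses = []
        # islice bounds each epoch, so an endless generator is fine here.
        for x, y in itertools.islice(batches, steps_per_epoch):
            epoch_losses.append(train_step(x, y))
        history.append(sum(epoch_losses) / len(epoch_losses))
    return history

def toy_batches():
    """Hypothetical endless generator of (inputs, targets) pairs."""
    for i in itertools.count():
        yield [i, i + 1], [2 * i, 2 * (i + 1)]

def toy_train_step(x, y):
    """Placeholder step: returns a fake 'loss' instead of updating a model."""
    return sum(abs(target - 2 * value) for value, target in zip(x, y))

batches = toy_batches()
history = run_training(batches, toy_train_step, epochs=2, steps_per_epoch=3)
print(history)
```

Because the loop unpacks each batch itself, how Keras interprets lists versus tuples no longer matters; the cost is that callbacks, validation and logging have to be handled by hand.
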