CNES / decloud

Apache License 2.0

Problem in training the unet_all_bands #11

Open mboumahdi opened 1 year ago

mboumahdi commented 1 year ago

Hi, I'm trying to train the unet_all_bands model to generate all Sentinel-2 bands. I generated the TFRecords with the 20m bands and used all the default parameters, but I get this error:

Call arguments received:
  • inputs=tf.Tensor(shape=(None, 256, 256, 4), dtype=float32)
Traceback (most recent call last):
File "/opt/otbtf/lib/python3/site-packages/tensorflow/python/training/coordinator.py", line 293, in stop_on_exception
    yield
File "/opt/otbtf/lib/python3/site-packages/tensorflow/python/distribute/mirrored_run.py", line 342, in run
    self.main_result = self.main_fn(*self.main_args, **self.main_kwargs)
File "/opt/otbtf/lib/python3/site-packages/tensorflow/python/autograph/impl/api.py", line 689, in wrapper
    return converted_call(f, args, kwargs, options=options)
File "/opt/otbtf/lib/python3/site-packages/tensorflow/python/autograph/impl/api.py", line 377, in converted_call
    return _call_unconverted(f, args, kwargs, options)
File "/opt/otbtf/lib/python3/site-packages/tensorflow/python/autograph/impl/api.py", line 458, in _call_unconverted
    return f(*args, **kwargs)
File "/opt/otbtf/lib/python3/site-packages/keras/engine/training.py", line 1783, in run_step
    outputs = model.predict_step(data)
File "/opt/otbtf/lib/python3/site-packages/keras/engine/training.py", line 1751, in predict_step
    return self(x, training=False)
File "/opt/otbtf/lib/python3/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
File "/opt/otbtf/lib/python3/site-packages/keras/layers/convolutional.py", line 3556, in call
    raise ValueError('Argument `cropping` must be '

ValueError: Exception encountered when calling layer "s2_estim_pad256" (type Cropping2D).

Argument `cropping` must be greater than the input shape. Received: inputs.shape=(None, 256, 256, 4), and cropping=((256, 256), (256, 256))

Call arguments received:
  • inputs=tf.Tensor(shape=(None, 256, 256, 4), dtype=float32)
2023-08-14 21:42:23 INFO Error reported to Coordinator: Exception encountered when calling layer "s2_estim_pad256" (type Cropping2D).

Argument `cropping` must be greater than the input shape. Received: inputs.shape=(None, 256, 256, 4), and cropping=((256, 256), (256, 256))

Call arguments received:
  • inputs=tf.Tensor(shape=(None, 256, 256, 4), dtype=float32)
Traceback (most recent call last):
File "models/train_from_tfrecords.py", line 262, in <module>
    system.run_and_terminate(main)
File "/usr/local/lib/python3.8/dist-packages/decloud/core/system.py", line 199, in run_and_terminate
    sys.exit(main(args=sys.argv[1:]))
File "models/train_from_tfrecords.py", line 244, in main
    model.fit(tf_ds_train,
File "/opt/otbtf/lib/python3/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
File "/usr/local/lib/python3.8/dist-packages/decloud/core/summary.py", line 71, in on_epoch_end
    predicted = self.model.predict(self.test_data)
ValueError: in user code:

File "/opt/otbtf/lib/python3/site-packages/keras/engine/training.py", line 1801, in predict_function  *
    return step_function(self, iterator)
File "/opt/otbtf/lib/python3/site-packages/keras/engine/training.py", line 1790, in step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.8/dist-packages/six.py", line 719, in reraise
    raise value
File "/opt/otbtf/lib/python3/site-packages/keras/engine/training.py", line 1783, in run_step  **
    outputs = model.predict_step(data)
File "/opt/otbtf/lib/python3/site-packages/keras/engine/training.py", line 1751, in predict_step
    return self(x, training=False)
File "/opt/otbtf/lib/python3/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
File "/opt/otbtf/lib/python3/site-packages/keras/layers/convolutional.py", line 3556, in call
    raise ValueError('Argument `cropping` must be '

ValueError: Exception encountered when calling layer "s2_estim_pad256" (type Cropping2D).

Argument `cropping` must be greater than the input shape. Received: inputs.shape=(None, 256, 256, 4), and cropping=((256, 256), (256, 256))

Call arguments received:
  • inputs=tf.Tensor(shape=(None, 256, 256, 4), dtype=float32)
remicres commented 1 year ago

Hello @meryeme-25

The 20m bands have not been extensively tested yet, but they should work. Did you succeed with the 10m bands only? Let me know.

In the meantime, we will investigate this issue.

mboumahdi commented 1 year ago

Hello @remicres, I also tried with the 10m bands only and got the same problem.

remicres commented 1 year ago

We can reproduce the problem with otbtf 4.1.0; it looks like something broke with recent Keras versions.

For now, I can only suggest building decloud on an older otbtf image (3.2.0 or 3.3.0 should be fine).

We are investigating the problem right now.
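
For anyone reading the traceback, here is a minimal plain-Python sketch of the kind of shape check that the error message describes. It is an illustration only: `cropped_shape` is a hypothetical helper, not the Keras implementation, and it assumes the layer rejects any crop that leaves no pixels along a spatial axis.

```python
def cropped_shape(input_shape, cropping):
    """Return the (height, width) left after a 2D crop, or raise if nothing is left.

    input_shape: (batch, height, width, channels); batch may be None.
    cropping: ((top, bottom), (left, right)), as printed in the error message.
    Hypothetical helper for illustration; not the Keras implementation.
    """
    _, height, width, _ = input_shape
    (top, bottom), (left, right) = cropping
    # The crop must leave at least one pixel along each spatial axis.
    if top + bottom >= height or left + right >= width:
        raise ValueError(
            f"cropping={cropping} removes the whole input of shape {input_shape}"
        )
    return height - top - bottom, width - left - right


# Values reported in the traceback: 256 + 256 rows cropped from 256 rows.
try:
    cropped_shape((None, 256, 256, 4), ((256, 256), (256, 256)))
except ValueError as err:
    print("rejected:", err)

# A crop that fits: 64 pixels removed on each side of a 256 x 256 patch.
print(cropped_shape((None, 256, 256, 4), ((64, 64), (64, 64))))
```

With the values from the log, the crop asks for 512 pixels out of 256 along each axis, so no valid output shape exists, which matches the ValueError raised for the "s2_estim_pad256" layer.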

remicres commented 1 year ago

Hi @meryeme-25

The bug should be fixed now. You should be able to use decloud with the latest master branch!