keras-team / keras-io

Keras documentation, hosted live at keras.io

Efficient Object Detection with YOLOV8 and KerasCV training issue #1475

Closed. Paryavi closed this issue 1 year ago.

Paryavi commented 1 year ago

I get this error in Colab Pro after training for one epoch of the KerasCV YOLOV8 example (https://keras.io/examples/vision/yolov8/):

Epoch 1/3
1271/1271 [==============================] - ETA: 0s - loss: 21.5977 - box_loss: 2.6112 - class_loss: 18.9865

UnknownError                              Traceback (most recent call last)
in <cell line: 1>()
----> 1 yolo.fit(
      2     train_ds,
      3     validation_data=val_ds,
      4     epochs=3,
      5     callbacks=[EvaluateCOCOMetricsCallback(val_ds, "model.h5")],

1 frames
in on_epoch_end(self, epoch, logs)
     18         self.metrics.update_state(y_true, y_pred)
     19
---> 20         metrics = self.metrics.result(force=True)
     21         logs.update(metrics)
     22

UnknownError: {{function_node __wrapped__EagerPyFunc_Tin_1_Tout_1_device_/job:localhost/replica:0/task:0/device:CPU:0}} InvalidArgumentError: {{function_node __wrapped__ConcatV2_N_317_device_/job:localhost/replica:0/task:0/device:CPU:0}} ConcatOp : Dimension 1 in both shapes must be equal: shape[0] = [4,2,4] vs. shape[1] = [4,1,4] [Op:ConcatV2] name: concat
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/ops/script_ops.py", line 265, in __call__
    return func(device, token, args)
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/ops/script_ops.py", line 143, in __call__
    outputs = self._call(device, args)
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/ops/script_ops.py", line 150, in _call
    ret = self._func(*args)
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/autograph/impl/api.py", line 642, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/keras_cv/metrics/object_detection/box_coco_metrics.py", line 203, in result_on_host_cpu
    return tf.constant(obj_result(force), obj.dtype)
  File "/usr/local/lib/python3.10/dist-packages/keras_cv/metrics/object_detection/box_coco_metrics.py", line 254, in result
    self._cached_result = self._compute_result()
  File "/usr/local/lib/python3.10/dist-packages/keras_cv/metrics/object_detection/box_coco_metrics.py", line 262, in _compute_result
    _box_concat(self.ground_truths),
  File "/usr/local/lib/python3.10/dist-packages/keras_cv/metrics/object_detection/box_coco_metrics.py", line 44, in _box_concat
    result[key] = tf.concat([b[key] for b in boxes], axis=0)
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/framework/ops.py", line 7262, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access

tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node __wrapped__ConcatV2_N_317_device_/job:localhost/replica:0/task:0/device:CPU:0}} ConcatOp : Dimension 1 in both shapes must be equal: shape[0] = [4,2,4] vs. shape[1] = [4,1,4] [Op:ConcatV2] name: concat

[Op:EagerPyFunc]

Paryavi commented 1 year ago

Colab AI suggests this:

The error is coming from the keras_cv.metrics.object_detection.box_coco_metrics module. The _box_concat function expects tensors with the same shape, but the shapes of the tensors in your input are not the same: the first tensor has a shape of [4, 2, 4], while the second has a shape of [4, 1, 4].

To fix this error, you need to make sure that the two tensors have the same shape. One way to do this is to use the tf.expand_dims function to add a new dimension to the second tensor. For example, you could use the following code:

tf.expand_dims(val_ds, axis=1)

This will add a new dimension to the second tensor, making it the same shape as the first. Once you have done this, you should be able to run yolo.fit without any errors.

meinternational commented 1 year ago

You can avoid the problem by not using a RaggedTensorSpec for 'boxes' and 'classes'. To do so, replace:

def dict_to_tuple(inputs):
    return inputs["images"], inputs["bounding_boxes"]

with (or similar):

def dict_to_tuple(inputs):
    return inputs["images"], bounding_box.to_dense(
        inputs["bounding_boxes"], max_boxes=32
    )

in the example code: https://keras.io/examples/vision/yolov8/
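For context, here is roughly how that dense-box version slots into the tutorial's tf.data pipeline. This is only a sketch using the variable names from the keras.io example (train_ds, val_ds, tf.data.AUTOTUNE); max_boxes=32 is just an assumed upper bound on boxes per image, not a value from the tutorial:

import tensorflow as tf
from keras_cv import bounding_box

def dict_to_tuple(inputs):
    # Pad the ragged boxes/classes out to a fixed size so every batch element
    # has the same shape and the concat inside the COCO metrics no longer fails.
    return inputs["images"], bounding_box.to_dense(
        inputs["bounding_boxes"], max_boxes=32
    )

train_ds = train_ds.map(dict_to_tuple, num_parallel_calls=tf.data.AUTOTUNE)
train_ds = train_ds.prefetch(tf.data.AUTOTUNE)
val_ds = val_ds.map(dict_to_tuple, num_parallel_calls=tf.data.AUTOTUNE)
val_ds = val_ds.prefetch(tf.data.AUTOTUNE)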

GaelMusoya commented 1 year ago

I had the same problem. In my case, all I had to do was manually find the corrupted images in my dataset and delete them. I tried writing a script to find them, but it was not good at catching other types of corruption that are only visible when viewing the image.
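Not the script used above, but a minimal sketch of the usual Pillow-based check; as noted, it only catches files that fail to decode, not every kind of visually obvious corruption (the directory layout is an assumption):

import os
from PIL import Image

def find_corrupted_images(image_dir):
    """Return paths of images that Pillow cannot decode."""
    bad = []
    for name in os.listdir(image_dir):
        path = os.path.join(image_dir, name)
        try:
            with Image.open(path) as img:
                img.verify()   # cheap structural check
            with Image.open(path) as img:
                img.load()     # force a full decode
        except Exception:
            bad.append(path)
    return bad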

Paryavi commented 1 year ago

Thanks guys for the tips. I did the same, as I explained in a different thread: to solve the problem I did not use ragged tensors, and I also used PyCOCOCallback; https://github.com/keras-team/keras-cv/issues/2025#issuecomment-1698399022
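For anyone following along, a sketch of that second change: swapping the tutorial's custom EvaluateCOCOMetricsCallback for KerasCV's built-in PyCOCOCallback. It assumes pycocotools is installed, the dense (non-ragged) val_ds from the fix above, and the "xyxy" box format used in the example:

from keras_cv.callbacks import PyCOCOCallback

coco_callback = PyCOCOCallback(
    validation_data=val_ds,      # yields (images, {"boxes", "classes"}) tuples
    bounding_box_format="xyxy",
)

yolo.fit(
    train_ds,
    validation_data=val_ds,
    epochs=3,
    callbacks=[coco_callback],
)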

timoffenhaeusser commented 8 months ago

This helped me a lot! I had the same issue and solved it by changing the dict_to_tuple function as @meinternational said. For max_boxes= I just used the number of boxes each of my images has (in my case all images have exactly the same number of boxes/objects).
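If the box count varies between images, one rough way to pick max_boxes is to scan the dataset for the largest count. This sketch assumes the ragged-batched dataset from the tutorial, before dict_to_tuple is mapped over it, where "classes" is a RaggedTensor of shape (batch, boxes_per_image):

import tensorflow as tf

def max_boxes_per_image(ds):
    # Track the largest number of boxes found in any single image.
    max_boxes = 0
    for sample in ds:
        row_lengths = sample["bounding_boxes"]["classes"].row_lengths()
        max_boxes = max(max_boxes, int(tf.reduce_max(row_lengths)))
    return max_boxes

# e.g. bounding_box.to_dense(..., max_boxes=max_boxes_per_image(train_ds))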

Then I just needed to do pip install pycocotools, and the problem was solved.