Closed Paryavi closed 1 year ago
What Colab AI suggests:
The error is coming from the keras_cv.metrics.object_detection.box_coco_metrics module. The _box_concat function is expecting two tensors with the same shape, but the shapes of the tensors in your input are not the same. The first tensor has a shape of [4, 2, 4], while the second tensor has a shape of [4, 1, 4].
To fix this error, you need to make sure that the two tensors have the same shape. One way to do this is to use the tf.expand_dims function to add a new dimension to the second tensor. For example, you could use the following code:
tf.expand_dims(val_ds, axis=1)
This will add a new dimension to the second tensor, making it have the same shape as the first tensor. Once you have done this, you should be able to run the yolo.fit function without any errors.
This is the same error I got in our main OD tutorial: #2017. Good to know that it also affects the YOLOv8 example.
I'm taking a look -- thanks for the issue report!
This is the offending line for reference: https://github.com/keras-team/keras-cv/blob/2ff8e3fd764bc67342778894cc984daac95c4813/keras_cv/metrics/object_detection/box_coco_metrics.py#L44
Thanks! It looks like this just expects padded boxes and the tutorial is not padding them correctly. Probably what happened is that something in KerasCV used to be turning them into Dense tensors and padding them but for some reason isn't anymore.
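To illustrate the padding point above, here is a minimal NumPy sketch (not KerasCV code) of why concatenating batches with different box counts fails, and how padding every batch to a fixed box count avoids it. The helper name pad_boxes, the -1 fill value, and the toy batches (shaped like the ones in the error message) are assumptions made for illustration only.

```python
import numpy as np


def pad_boxes(boxes, max_boxes, fill=-1.0):
    """Pad the box axis (axis 1) of a [batch, n_boxes, 4] array up to max_boxes.

    Hypothetical helper for illustration; `fill` marks the padded slots.
    """
    batch, n_boxes, coords = boxes.shape
    if n_boxes > max_boxes:
        raise ValueError(f"{n_boxes} boxes exceed max_boxes={max_boxes}")
    padded = np.full((batch, max_boxes, coords), fill, dtype=boxes.dtype)
    padded[:, :n_boxes, :] = boxes
    return padded


# Two batches shaped like the ones in the error message.
batch_a = np.zeros((4, 2, 4), dtype=np.float32)
batch_b = np.zeros((4, 1, 4), dtype=np.float32)

# Concatenating along the batch axis requires every other axis to match.
try:
    np.concatenate([batch_a, batch_b], axis=0)
    concat_failed = False
except ValueError:
    concat_failed = True

# After padding both batches to the same box count, the concat succeeds.
merged = np.concatenate(
    [pad_boxes(batch_a, max_boxes=2), pad_boxes(batch_b, max_boxes=2)], axis=0
)
```

Any code that later reads the merged boxes would need to skip the padded slots, which is why a sentinel value that cannot be a real coordinate is used here.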
Great, let's add @LukeWood and @IMvision12 to the loop! I more or less gave up on YOLO yesterday. Today I am trying to train based on the KerasCV RetinaNet example: https://lukewood.xyz/blog/marine-animal-detection
I was able to split my Labelbox JSON into 3 folders as in Luke Wood's example, and I got the generator to work (I was using a CPU in Colab; after switching to a GPU, the generator and then the visualization function work!). My bbox format is xywh. When calling model.fit I get this error:
history = model.fit(
    train_ds.take(1),
    validation_data=eval_ds.take(1),
    epochs=1,  # EPOCHS
)
Btw, this is how I modified the generator:
import json
import os
import keras_cv
import tensorflow as tf
# Note: `splits` (split name -> folder path) and `load_image` are not defined
# in this snippet; they are assumed to be defined elsewhere in the notebook.
def load(*, split, bounding_box_format):
    if split not in splits:
        raise ValueError(
            f"Invalid split provided, `split={split}`. "
            f"Expected one of {list(splits.keys())}"
        )
    path = splits[split]
    with open(os.path.join(path, 'annotations.json'), 'r') as f:
        file_annotations = json.load(f)
    # Create a dictionary to map image_ids to image file paths for quick lookup
    image_id_to_file_path = {img['id']: img['file_name'] for img in file_annotations['images']}
    def generator():
        for image_entry in file_annotations['images']:
            image_id = image_entry['id']
            image_path = image_id_to_file_path.get(image_id, None)
            if not image_path:
                continue
            annotations_for_image = [anno for anno in file_annotations['annotations'] if anno['image_id'] == image_id]
            box_labels = []
            class_labels = []
            for annotation in annotations_for_image:
                box = annotation['bbox']
                box = tf.constant([float(coord) for coord in box], tf.float32)
                box_labels.append(box)
                class_labels.append(tf.constant(float(annotation['category_id']), tf.float32))
            # Skip images that have no annotations
            if not box_labels:
                continue
            bounding_boxes = {
                'boxes': tf.stack(box_labels),
                'classes': tf.stack(class_labels)
            }
            image = load_image(os.path.join(path, image_path))
            bounding_boxes = keras_cv.bounding_box.convert_format(bounding_boxes, source='xywh', target=bounding_box_format)
            yield {
                'images': image,
                'bounding_boxes': bounding_boxes
            }
    output_spec = {
        'images': tf.TensorSpec(shape=(None, None, 3)),
        'bounding_boxes': {
            'boxes': tf.TensorSpec(shape=(None, 4)),
            'classes': tf.TensorSpec(shape=(None,))
        }
    }
    return tf.data.Dataset.from_generator(generator, output_signature=output_spec)
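One possible refinement of the generator above, offered only as a sketch: the list comprehension that builds annotations_for_image rescans every annotation for every image. Grouping the annotations by image_id once, before the loop, avoids that. The helper name group_annotations_by_image and the toy file_annotations dict below are hypothetical; only the 'images', 'annotations', 'id', 'image_id', 'bbox' and 'category_id' keys are taken from the code above.

```python
from collections import defaultdict


def group_annotations_by_image(annotations):
    """Return a dict mapping image_id -> list of that image's annotations."""
    grouped = defaultdict(list)
    for anno in annotations:
        grouped[anno["image_id"]].append(anno)
    return grouped


# Toy data shaped like a COCO-style annotations.json (values are made up).
file_annotations = {
    "images": [
        {"id": 1, "file_name": "a.jpg"},
        {"id": 2, "file_name": "b.jpg"},
        {"id": 3, "file_name": "c.jpg"},  # no annotations for this image
    ],
    "annotations": [
        {"image_id": 1, "bbox": [10, 20, 30, 40], "category_id": 0},
        {"image_id": 2, "bbox": [5, 5, 50, 60], "category_id": 1},
        {"image_id": 1, "bbox": [0, 0, 15, 15], "category_id": 1},
    ],
}

annotations_by_image = group_annotations_by_image(file_annotations["annotations"])

# Inside the generator, the per-image lookup then becomes a dict access;
# .get(..., []) keeps the "skip images without boxes" behaviour.
boxes_per_image = {
    image["id"]: [a["bbox"] for a in annotations_by_image.get(image["id"], [])]
    for image in file_annotations["images"]
}
```

With a few thousand images this mostly matters for start-up time per epoch, since the generator is re-run every time the dataset is iterated.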
Error:
ValueError Traceback (most recent call last)
4 frames
/usr/local/lib/python3.10/dist-packages/keras_cv/models/object_detection/retinanet/retinanet_label_encoder.py in tf___encode_sample(self, box_labels, anchor_boxes, image_shape)
     53 batch_size = ag__.ld(box_shape)[0]
     54 n_boxes = ag__.ld(box_shape)[1]
---> 55 box_ids = ag__.converted_call(ag__.ld(ops).arange, (ag__.ld(gt_boxes).shape[1],), dict(dtype=ag__.ld(matched_gt_idx).dtype), fscope)
     56 matched_ids = ag__.converted_call(ag__.ld(ops).expand_dims, (ag__.ld(matched_gt_idx),), dict(axis=-1), fscope)
     57 matches = ag__.ld(box_ids) == ag__.ld(matched_ids)
ValueError: in user code:
File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1338, in train_function *
return step_function(self, iterator)
File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1322, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1303, in run_step **
outputs = model.train_step(data)
File "/usr/local/lib/python3.10/dist-packages/keras_cv/models/object_detection/retinanet/retinanet.py", line 465, in train_step
boxes, classes = self.label_encoder(x, y_for_label_encoder)
File "/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/tmp/__autograph_generated_filel16ndk94.py", line 48, in tf__call
result = ag__.converted_call(ag__.ld(self)._encode_sample, (ag__.ld(box_labels), ag__.ld(anchor_boxes), ag__.ld(image_shape)), None, fscope)
File "/tmp/__autograph_generated_filez4i80vy6.py", line 55, in tf___encode_sample
box_ids = ag__.converted_call(ag__.ld(ops).arange, (ag__.ld(gt_boxes).shape[1],), dict(dtype=ag__.ld(matched_gt_idx).dtype), fscope)
ValueError: Exception encountered when calling layer 'retina_net_label_encoder_3' (type RetinaNetLabelEncoder).
in user code:
File "/usr/local/lib/python3.10/dist-packages/keras_cv/models/object_detection/retinanet/retinanet_label_encoder.py", line 215, in call *
result = self._encode_sample(box_labels, anchor_boxes, image_shape)
File "/usr/local/lib/python3.10/dist-packages/keras_cv/models/object_detection/retinanet/retinanet_label_encoder.py", line 169, in _encode_sample *
box_ids = ops.arange(gt_boxes.shape[1], dtype=matched_gt_idx.dtype)
ValueError: None values not supported.
Call arguments received by layer 'retina_net_label_encoder_3' (type RetinaNetLabelEncoder):
• images=tf.Tensor(shape=(None, 640, 640, 3), dtype=float32)
• box_labels={'boxes': 'tf.RaggedTensor(values=Tensor("RaggedFromVariant/RaggedTensorFromVariant:1", shape=(None, None), dtype=float32), row_splits=Tensor("RaggedFromVariant/RaggedTensorFromVariant:0", shape=(None,), dtype=int64))', 'classes': 'tf.RaggedTensor(values=Tensor("RaggedFromVariant_1/RaggedTensorFromVariant:1", shape=(None,), dtype=float32), row_splits=Tensor("RaggedFromVariant_1/RaggedTensorFromVariant:0", shape=(None,), dtype=int64))'}
If needed, I can email you my Colab notebooks for YOLO and RetinaNet along with the data: the images (they are very small, about 20 KB each) and the annotations (JSON files, plus XML files in the YOLO case), which you can mount using Google Drive. But I also think it is probably the padding issue.
Could the padding problem be caused by my images being small?
My dataset has around 6k images in three different pixel sizes.
Meanwhile, I will try to write a Python script to resize the images and my xywh bounding boxes to 640 by 640 pixels. If there is existing code for that, let me know, since I have 3 sizes of images as mentioned above.
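As a rough sketch of the box half of that resize script: if an image is stretched to a new size, each xywh box scales by the same per-axis factors. The function name resize_xywh_boxes is hypothetical, and the sketch assumes that xywh means (left, top, width, height) in pixels and that sizes are given as (width, height). The image itself would still need to be resized separately, and a plain stretch to 640 by 640 changes the aspect ratio.

```python
def resize_xywh_boxes(boxes, src_size, dst_size=(640, 640)):
    """Scale xywh boxes from an image of src_size to one of dst_size.

    Both sizes are (width, height). Assumes pixel-space xywh boxes where
    x, y is the top-left corner. Hypothetical helper for illustration.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    return [
        [x * dst_w / src_w, y * dst_h / src_h, w * dst_w / src_w, h * dst_h / src_h]
        for x, y, w, h in boxes
    ]


# A box covering the centre of a 400x296 image, stretched to 640x640.
scaled = resize_xywh_boxes([[100, 74, 200, 148]], src_size=(400, 296))
print(scaled)  # [[160.0, 160.0, 320.0, 320.0]]
```

Because the scale factors are computed per image, the same function handles a dataset with several source sizes.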
@Paryavi I would recommend filing a new issue for each problem. We want each issue to have a clear deliverable and scope.
@Paryavi my recommendation would be to use the PyCOCOCallback for metric evaluation, as BoxCOCOMetrics were created when we were TF-only and won't support Torch+JAX, so they are likely to be deprecated.
For the marine_animal blog example, the solution to the ragged tensor problem was to use the to_dense() function as follows.
First I imported it with: from keras_cv import bounding_box
Then, before compile, I added:
def dict_to_tuple(inputs):
    return inputs["images"], bounding_box.to_dense(
        inputs["bounding_boxes"], max_boxes=32
    )
Reference: https://keras.io/guides/keras_cv/object_detection_keras_cv/ Thanks to @LukeWood
But I guess both models should work with ragged tensors. Can you try with ragged tensors? @Paryavi
model.fit does not work so far with my dataset. @ianstenbit I am not using Torch+JAX; I am using this YOLO example with the TensorFlow backend. Is there a sample implementation using PyCOCOCallback? I will search the Keras API for it.
I found pycoco_callback: https://github.com/keras-team/keras-cv/blob/master/keras_cv/callbacks/pycoco_callback_test.py and an example that uses it: https://github.com/keras-team/keras-cv/blob/master/examples/training/object_detection/pascal_voc/retinanet.py I will try to modify the YOLO code and see if I can fix it. Here is how it worked, by modifying the following:
from keras_cv import bounding_box
from keras_cv.callbacks import PyCOCOCallback
# Convert each sample dict to an (images, dense bounding boxes) tuple
def dict_to_tuple(inputs):
    return inputs["images"], bounding_box.to_dense(
        inputs["bounding_boxes"], max_boxes=32
    )
train_ds = train_ds.map(dict_to_tuple, num_parallel_calls=tf.data.AUTOTUNE)
val_ds = val_ds.map(dict_to_tuple, num_parallel_calls=tf.data.AUTOTUNE)
train_ds = train_ds.prefetch(tf.data.AUTOTUNE)
val_ds = val_ds.prefetch(tf.data.AUTOTUNE)
callback = PyCOCOCallback(
    validation_data=val_ds,
    bounding_box_format="xywh",
)
yolo.fit(
    train_ds,
    validation_data=val_ds,
    epochs=2,
    callbacks=[callback],
)
@ianstenbit is RaggedTensors supported by keras-core?
Ragged Tensors will work with KerasCV when using Keras Core with the TF backend. It will not work for other backends.
Thanks @ianstenbit, So @IMvision12, in your Yolo example did you use Keras Core with the TF backend or Keras Core with the other backends? I guess just the imports should be different in these two choices.
Yeah I am updating the yolov8 example here : https://github.com/keras-team/keras-io/pull/1514
That's cool, keep hammering! A different question: for YOLOv8, what would you do if the images were very small, e.g. 400 by 600 pixels? Should I modify the filters (kernels), or freeze more or fewer layers, and how? Where would you look for concrete code showing how to fine-tune the YOLOv8 model if mAP was not good?
@Paryavi please file another issue ;)
import tensorflow as tf
import random
import pandas as pd  # Add this import
from tensorflow.keras.optimizers import Adam
num_episodes = 100
num_support_samples_per_class = 5
num_query_samples_per_class = 5
few_shot_learning_rate = 0.001
input_shape = (224, 224)
few_shot_optimizer = Adam(learning_rate=few_shot_learning_rate)
for episode in range(num_episodes):
    support_set, query_set = create_episode(
        num_support_samples_per_class, num_query_samples_per_class
    )
    # Create data generators for the support and query sets.
    batch_size = 5  # Set your desired batch size
    support_data_generator = create_data_generator(support_set, batch_size=batch_size)
    query_data_generator = create_data_generator(query_set, batch_size=batch_size)
    #query_data_generator = create_data_generator(query_set, batch_size=5)
    #support_data_generator = create_data_generator(support_set, input_shape, batch_size=num_support_samples_per_class)
    #query_data_generator = create_data_generator(query_set, input_shape, batch_size=num_query_samples_per_class)
    # Train the few-shot model on the support set.
    few_shot_model.compile(optimizer=few_shot_optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    few_shot_model.fit(support_data_generator, epochs=1, verbose=0)
    # Evaluate the few-shot model on the query set.
    evaluation_metrics = few_shot_model.evaluate(query_data_generator, verbose=0)
    accuracy = evaluation_metrics[1]  # Assuming accuracy is the second metric in the list.
    print(f"Episode {episode + 1}: Accuracy = {accuracy:.2%}")
    # Update the few-shot model's weights (you can implement your own update logic here)
Found 50 validated image filenames belonging to 10 classes.
Found 50 validated image filenames belonging to 10 classes.
ValueError Traceback (most recent call last)
1 frames
/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py in error_handler(*args, **kwargs)
68 # To get the full stack trace, call:
69 # tf.debugging.disable_traceback_filtering()
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py in tf__train_function(iterator)
     13 try:
     14 do_return = True
---> 15 retval = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
     16 except:
     17 do_return = False
ValueError: in user code:
File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1338, in train_function *
return step_function(self, iterator)
File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1322, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1303, in run_step **
outputs = model.train_step(data)
File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1081, in train_step
loss = self.compute_loss(x, y, y_pred, sample_weight)
File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1139, in compute_loss
return self.compiled_loss(
File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/compile_utils.py", line 265, in __call__
loss_value = loss_obj(y_t, y_p, sample_weight=sw)
File "/usr/local/lib/python3.10/dist-packages/keras/src/losses.py", line 142, in __call__
losses = call_fn(y_true, y_pred)
File "/usr/local/lib/python3.10/dist-packages/keras/src/losses.py", line 268, in call **
return ag_fn(y_true, y_pred, **self._fn_kwargs)
File "/usr/local/lib/python3.10/dist-packages/keras/src/losses.py", line 2122, in categorical_crossentropy
return backend.categorical_crossentropy(
File "/usr/local/lib/python3.10/dist-packages/keras/src/backend.py", line 5560, in categorical_crossentropy
target.shape.assert_is_compatible_with(output.shape)
ValueError: Shapes (None, None) and (None, None, None, 5) are incompatible
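One possible reading of this message, sketched below in NumPy rather than Keras: the labels are rank 2 (batch, classes) while the model output is rank 4 with 5 channels, which would happen if the final layers keep the spatial height and width axes. This is an assumption, since few_shot_model is not shown. The helper shapes_compatible is hypothetical and only approximates the static shape check; the 7x7 feature map is made up. Note also that the data generators report 10 classes while the output has 5 channels, so the class count may need aligning as well.

```python
import numpy as np


def shapes_compatible(target_shape, output_shape):
    """Simplified static check: equal rank, and each dim equal or unknown (None)."""
    if len(target_shape) != len(output_shape):
        return False
    return all(
        t is None or o is None or t == o
        for t, o in zip(target_shape, output_shape)
    )


# The shapes from the error message: rank 2 labels vs. rank 4 predictions.
error_case_ok = shapes_compatible((None, None), (None, None, None, 5))

# Averaging over the spatial axes (what a global average pooling layer does)
# turns a (batch, height, width, channels) map into (batch, channels).
features = np.zeros((5, 7, 7, 5), dtype=np.float32)
pooled = features.mean(axis=(1, 2))

# One-hot labels with 5 classes would now match; labels with 10 would not.
five_class_ok = shapes_compatible((5, 5), pooled.shape)
ten_class_ok = shapes_compatible((5, 10), pooled.shape)
```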
My images are small. There are about 6k images with the following sizes: {(400, 296): 3484, (480, 320): 2763, (640, 480): 108}
I get the error after the first training epoch finishes (I train in Google Colab). Initially the loss decreases, but at the end of the first epoch it crashes as follows:
Epoch 1/3
1271/1271 [==============================] - ETA: 0s - loss: 21.5977 - box_loss: 2.6112 - class_loss: 18.9865
UnknownError Traceback (most recent call last)
in <cell line: 1>()
----> 1 yolo.fit(
      2 train_ds,
      3 validation_data=val_ds,
      4 epochs=3,
      5 callbacks=[EvaluateCOCOMetricsCallback(val_ds, "model.h5")],
1 frames
in on_epoch_end(self, epoch, logs)
     18 self.metrics.update_state(y_true, y_pred)
     19
---> 20 metrics = self.metrics.result(force=True)
     21 logs.update(metrics)
     22
UnknownError: {{function_node _wrapped__EagerPyFunc_Tin_1_Tout_1_device/job:localhost/replica:0/task:0/device:CPU:0}} InvalidArgumentError: {{function_node _wrapped__ConcatV2_N_317_device/job:localhost/replica:0/task:0/device:CPU:0}} ConcatOp : Dimension 1 in both shapes must be equal: shape[0] = [4,2,4] vs. shape[1] = [4,1,4] [Op:ConcatV2] name: concat
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/ops/script_ops.py", line 265, in __call__
return func(device, token, args)
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/ops/script_ops.py", line 143, in __call__
outputs = self._call(device, args)
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/ops/script_ops.py", line 150, in _call
ret = self._func(*args)
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/autograph/impl/api.py", line 642, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/keras_cv/metrics/object_detection/box_coco_metrics.py", line 203, in result_on_host_cpu
return tf.constant(obj_result(force), obj.dtype)
File "/usr/local/lib/python3.10/dist-packages/keras_cv/metrics/object_detection/box_coco_metrics.py", line 254, in result
self._cached_result = self._compute_result()
File "/usr/local/lib/python3.10/dist-packages/keras_cv/metrics/object_detection/box_coco_metrics.py", line 262, in _compute_result
_box_concat(self.ground_truths),
File "/usr/local/lib/python3.10/dist-packages/keras_cv/metrics/object_detection/box_coco_metrics.py", line 44, in _box_concat
result[key] = tf.concat([b[key] for b in boxes], axis=0)
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/framework/ops.py", line 7262, in raise_from_not_ok_status
raise core._status_to_exception(e) from None # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node _wrapped__ConcatV2_N_317_device/job:localhost/replica:0/task:0/device:CPU:0}} ConcatOp : Dimension 1 in both shapes must be equal: shape[0] = [4,2,4] vs. shape[1] = [4,1,4] [Op:ConcatV2] name: concat
[Op:EagerPyFunc]