leondgarse / keras_cv_attention_models

Keras beit,caformer,CMT,CoAtNet,convnext,davit,dino,efficientdet,edgenext,efficientformer,efficientnet,eva,fasternet,fastervit,fastvit,flexivit,gcvit,ghostnet,gpvit,hornet,hiera,iformer,inceptionnext,lcnet,levit,maxvit,mobilevit,moganet,nat,nfnets,pvt,swin,tinynet,tinyvit,uniformer,volo,vanillanet,yolor,yolov7,yolov8,yolox,gpt2,llama2, alias kecam
MIT License

training yolov8 with anchor-free anchor mode #124

Closed shshrzad closed 11 months ago

shshrzad commented 1 year ago

When I try to train yolov8 with the anchor-free anchor mode, an error related to tensor dimensions occurs. How can I solve this problem?

leondgarse commented 1 year ago

Can you provide the training command you are using? Actually, coco_train_script.py with yolov8 has not been tested...

shshrzad commented 1 year ago

Sure. This is the command I used:

CUDA_VISIBLE_DEVICES='0' ./coco_train_script.py --det_header yolov8.YOLOV8_M --backbone iformer.IFormerSmall --freeze_backbone_epochs 0 --batch_size 4 -A anchor_free -e 200 -i 512 --summary store_true -d /home/shahrzad/miniconda3/envs/sr-det/keras_cv_attention_models-main_original/train_val_768.json --use_l1_loss store_true

and the error is:

" Epoch 1/200 Traceback (most recent call last): File "/home/shahrzad/miniconda3/envs/sr-det/keras_cv_attention_models-main_yolo8/./coco_train_script.py", line 299, in run_training_by_args(args) File "/home/shahrzad/miniconda3/envs/sr-det/keras_cv_attention_models-main_yolo8/./coco_train_script.py", line 272, in run_training_by_args latest_save, hist = train( File "/home/shahrzad/miniconda3/envs/sr-det/keras_cv_attention_models-main_yolo8/keras_cv_attention_models/imagenet/train_func.py", line 270, in train hist = compiled_model.fit( File "/home/shahrzad/miniconda3/envs/sr-det/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler raise e.with_traceback(filtered_tb) from None File "/tmp/autograph_generated_filel7qbqb50.py", line 15, in tftrainfunction retval = ag.converted_call(ag__.ld(step_function), (ag.ld(self), ag.ld(iterator)), None, fscope) File "/tmp/autograph_generated_filenh7s8e7w.py", line 31, in tfcall (class_loss, bbox_loss, object_loss, l1_loss, dfl_loss, num_valid, class_acc) = ag__.converted_call(ag.ld(tf).map_fn, (ag.ld(self).call_single, (ag.ld(y_true), ag.ld(y_pred))), dict(fn_output_signature=ag__.ld(out_dtype)), fscope) File "/tmp/autograph_generated_file4m076vdr.py", line 13, in tf__call_single_ retval = ag.converted_call(ag.ld(tf).cond, (ag.converted_call(ag.ld(tf).reduce_any, (ag.ld(bbox_labels_true)[:, -1] > 0,), None, fscope), ag.autograph_artifact(lambda : ag.converted_call(ag.ld(self).valid_call_single, (ag.ld(bbox_labels_true), ag.ld(bbox_labels_pred)), None, fscope)), ag.autograph_artifact(lambda : (0.0, 0.0, ag.converted_call(ag__.ld(tf).reduce_sum, (ag.converted_call(ag.ld(K).binary_crossentropy, (0.0, ag.ld(bbox_labels_pred)[:, -1]), None, fscope),), None, fscope), 0.0, 0.0, 0.0, 0.0))), None, fscope) File "/tmp/autograph_generatedfile4m076vdr.py", line 13, in retval = ag.converted_call(ag.ld(tf).cond, (ag.converted_call(ag.ld(tf).reduce_any, (ag.ld(bbox_labels_true)[:, -1] > 0,), None, fscope), ag.autograph_artifact(lambda : ag.converted_call(ag.ld(self).valid_call_single, (ag.ld(bbox_labels_true), ag.ld(bbox_labels_pred)), None, fscope)), ag.autograph_artifact(lambda : (0.0, 0.0, ag.converted_call(ag__.ld(tf).reduce_sum, (ag.converted_call(ag.ld(K).binary_crossentropy, (0.0, ag.ld(bbox_labels_pred)[:, -1]), None, fscope),), None, fscope), 0.0, 0.0, 0.0, 0.0))), None, fscope) File "/tmp/autograph_generated_file3f35y02q.py", line 10, in tf__valid_call_single bbox_labels_true_assined = ag.converted_call(ag.ld(tf).stop_gradient, (ag__.converted_call(ag.ld(self).anchor_assign, (ag.ld(bbox_labels_true), ag.ld(bbox_labels_pred)), None, fscope),), None, fscope) File "/tmp/autograph_generatedfile3wr9e9c.py", line 46, in tf__call cls_loss = ag.converted_call(ag.ld(K).binary_crossentropy, (ag__.converted_call(ag.ld(tf).expand_dims, (ag.ld(labels_true), 1), None, fscope), ag__.converted_call(ag.ld(tf).expand_dims, (ag__.ld(obj_labels_pred), 0), None, fscope)), None, fscope) ValueError: in user code:

File

"/home/shahrzad/miniconda3/envs/sr-det/lib/python3.9/site-packages/keras/engine/training.py", line 1249, in train_function return step_function(self, iterator) File "/home/shahrzad/miniconda3/envs/sr-det/keras_cv_attention_models-main_yolo8/keras_cv_attention_models/coco/losses.py", line 280, in call class_loss, bbox_loss, object_loss, l1_loss, dfl_loss, num_valid, class_acc = tf.map_fn( File "/home/shahrzad/miniconda3/envs/sr-det/keras_cv_attention_models-main_yolo8/keras_cv_attention_models/coco/losses.py", line 270, in call_single__ * lambda: (0.0, 0.0, tf.reduce_sum(K.binary_crossentropy(0.0, bbox_labels_pred[:, -1])), 0.0, 0.0, 0.0, 0.0), # Object loss only, target is all False File "/home/shahrzad/miniconda3/envs/sr-det/keras_cv_attention_models-main_yolo8/keras_cv_attention_models/coco/losses.py", line 227, in valid_call_single__ bbox_labels_true_assined = tf.stop_gradient(self.anchor_assign(bbox_labels_true, bbox_labels_pred)) File "/home/shahrzad/miniconda3/envs/sr-det/keras_cv_attention_models-main_yolo8/keras_cv_attention_models/coco/anchors_func.py", line 570, in call cls_loss = K.binary_crossentropy(tf.expand_dims(labels_true, 1), tf.expand_dims(obj_labels_pred, 0)) # [num_bboxes, num_picked_anchors, num_classes] File "/home/shahrzad/miniconda3/envs/sr-det/lib/python3.9/site-packages/keras/backend.py", line 5688, in binary_crossentropy bce = target * tf.math.log(output + epsilon())

ValueError: Dimensions must be equal, but are 6 and 66 for '{{node

AnchorFreeLoss/map/while/cond/mul_8}} = MulT=DT_FLOAT' with input shapes: [?,1,6], [1,?,66]. "
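
For context, a minimal standalone sketch (not from the repository) that reproduces the same kind of broadcasting failure the error reports. The last-dimension sizes 6 and 66 come from the error message, the variable names mirror the traceback, and everything else is illustrative only:

```python
# Illustrative only: the two inputs broadcast to [num_bboxes, num_anchors, ...]
# but their last dimensions (6 vs 66) are incompatible, which is exactly what
# the ValueError above complains about.
import tensorflow as tf
from tensorflow.keras import backend as K

labels_true = tf.zeros([3, 6])        # last dim 6, as on the labels side of the error
obj_labels_pred = tf.zeros([5, 66])   # last dim 66, as on the predictions side

try:
    # Same call pattern as anchors_func.py line 570 in the traceback above
    K.binary_crossentropy(tf.expand_dims(labels_true, 1), tf.expand_dims(obj_labels_pred, 0))
except (tf.errors.InvalidArgumentError, ValueError) as e:
    print("shape mismatch:", e)
```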


leondgarse commented 1 year ago

This should be fixed now; there's a basic test in the Colab notebook kecam_coco_tiny_test.ipynb (https://colab.research.google.com/drive/1m8exC3Jh9_gT8Ey5IKfoYRclfcLUf_rw?usp=drive_link). Still, the training performance cannot be guaranteed...

shshrzad commented 1 year ago

Thank you for your consideration.

Yes, I tested the code. It runs without any errors, but I think its performance still isn't reliable.


leondgarse commented 1 year ago

Actually, in most of my attempts at COCO training, the TensorFlow version cannot match the original PyTorch one in either speed or performance. So I'm considering supporting training with PyTorch only, and I have to say, this coco_train_script.py hasn't been tested for a long time...

shshrzad commented 1 year ago

Thank you for your explanation. Do you think that if I train my YOLOv8 model with the anchor-free anchor mode in the TensorFlow framework, I may get incorrect results? What is your recommendation now?

leondgarse commented 1 year ago

At least in my implementation it's not as good as the PyTorch one. You may also refer to the keras-cv implementation, YOLOV8 for Object Detection (https://github.com/keras-team/keras-cv/pull/1711). I've updated the Colab notebook kecam_coco_tiny_test.ipynb (https://colab.research.google.com/drive/1m8exC3Jh9_gT8Ey5IKfoYRclfcLUf_rw?usp=drive_link) with a slightly longer training run, which shows a reasonable result. It's just that some hyperparameters and dataset details are not set the same as in the PyTorch one, and you will have to try it out yourself... So my recommendation is still the original PyTorch one. Currently mine is still under testing and development.
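
For reference, a minimal sketch of that recommendation, assuming "the original PyTorch one" means the Ultralytics YOLOv8 package; the YOLO class and train() arguments below follow the Ultralytics API rather than this repository, and the epoch / image-size / batch values simply mirror the command earlier in this thread:

```python
# Hedged sketch: training YOLOv8 with the original PyTorch implementation
# (Ultralytics), assuming `pip install ultralytics` is available.
from ultralytics import YOLO

model = YOLO("yolov8m.pt")      # pretrained medium model; YOLOv8 is anchor-free by design
model.train(
    data="coco.yaml",           # or a custom dataset yaml with train/val image lists
    epochs=200,                 # mirrors -e 200 from the command above
    imgsz=512,                  # mirrors -i 512
    batch=4,                    # mirrors --batch_size 4
)
```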

shshrzad commented 1 year ago

Thanks for the tip!
