Le0v1n opened 1 week ago
I used the script (`yolo_world_v2_l_vlpan_bn_sgd_1e-3_40e_8gpus_finetune_coco.py`) to fine-tune on a custom dataset with only one class, and I also encountered similar problems: the loss was always large and could not be reduced.

```
grad_norm: nan  loss: 194.9646  loss_cls: 69.6331  loss_bbox: 56.4941  loss_dfl: 68.8373
coco/bbox_mAP: 0.0020  coco/bbox_mAP_50: 0.0130  coco/bbox_mAP_75: 0.0000  coco/bbox_mAP_s: -1.0000  coco/bbox_mAP_m: -1.0000  coco/bbox_mAP_l: 0.0020
```
If I want to train my own dataset, its situation is as follows: it is clearly a traditional object detection dataset. Now I want to fine-tune YOLO-World on it, and I would like to confirm whether my thinking (steps) is correct:
First, I convert my YOLO-format annotations to COCO format with the `third_party/mmyolo/tools/dataset_converters/yolo2coco.py` script. Then I create a class-text JSON file in the `data/texts/` folder; since my dataset has only one `person` category, the content of this JSON file is `[["person"]]`.
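To be concrete, the class-text file can be generated with a few lines of Python (the filename `person_class_texts.json` is my own choice, not something the repo prescribes):

```python
import json
import os

# One inner list per class; my dataset has only the "person" category.
texts = [["person"]]

os.makedirs('data/texts', exist_ok=True)
with open('data/texts/person_class_texts.json', 'w') as f:
    json.dump(texts, f)
```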
Since my dataset only has BBox coordinates and does not have corresponding captions, I should use Normal Fine-tuning. The configuration file I want to use is `configs/finetune_coco/yolo_world_v2_s_bn_2e-4_80e_8gpus_mask-refine_finetune_coco.py`, but unfortunately I did not find this pre-trained weight on HF, so I used `configs/finetune_coco/yolo_world_v2_s_vlpan_bn_2e-4_80e_8gpus_mask-refine_finetune_coco.py` instead.
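For reference, the overrides I apply on top of that config look roughly like this. This is only a sketch in the mmengine config style used by the repo; the base filename and the text-file path are my own assumptions:

```python
# Sketch of my single-class fine-tuning overrides (my own values,
# not from the official repo).
_base_ = 'yolo_world_v2_s_vlpan_bn_2e-4_80e_8gpus_mask-refine_finetune_coco.py'

num_classes = 1  # only "person"
class_text_path = 'data/texts/person_class_texts.json'  # the [["person"]] file
```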
During the training process, some warnings appear in the terminal:

I am not sure whether these warnings will affect the training.
After that, during training I found that the loss is a bit too large. As the number of epochs increases (after 77 epochs),

```
loss: 372.6489  loss_cls: 243.9510  loss_bbox: 63.6398
```

became

```
loss: 109.2642  loss_cls: 24.2213  loss_bbox: 38.7833
```

I would like to ask whether this loss is normal.

My understanding of VL-PAN is that it is used to handle the linking of text data and image data; I do not know whether this understanding is correct.
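To make my understanding concrete, here is a toy NumPy sketch of the kind of text-guided fusion I imagine VL-PAN doing (a "max-sigmoid"-style attention; this is only my illustration, not the actual YOLO-World implementation):

```python
import numpy as np

def max_sigmoid_attention(img_feat, text_emb):
    """Toy text-guided feature modulation (my illustration, not the repo's code).

    img_feat: (N, C) image features, one row per spatial location
    text_emb: (K, C) text embeddings, one row per class word
    """
    sim = img_feat @ text_emb.T  # (N, K) location-word similarity
    # Take the best-matching word per location, squash to (0, 1).
    gate = 1.0 / (1.0 + np.exp(-sim.max(axis=1, keepdims=True)))
    return img_feat * gate       # re-weight image features by text relevance

feats = np.random.randn(4, 8)   # 4 spatial locations, 8 channels
texts = np.random.randn(1, 8)   # a single "person" text embedding
out = max_sigmoid_attention(feats, texts)
print(out.shape)                # same shape as feats, now text-modulated
```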
For easy reading, I have summarized my questions as follows:
1. The configuration file `configs/finetune_coco/yolo_world_v2_s_bn_2e-4_80e_1gpu_finetune_coco128.py` sets `load_from = '../FastDet/output_models/pretrain_yolow-v8_s_clipv2_frozen_te_noprompt_t2i_bn_2e-3adamw_scale_lr_wd_32xb16-100e_obj365v1_goldg_cc3mram250k_train_lviseval-e3592307_rep_conv.pth'`, but I did not find this weight on HF. Where can I obtain it?
2. The config name contains `mask-refine`, but my dataset does not have mask annotations. Can fine-tuning still be completed like this?

Thank you very much for answering my questions 😊!