open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

SwinT Weight Fine-tuning Leads to Drastic Decline in 'Pole' Object Detection in GroundingDINO #11005

Open zhangnanyue opened 1 year ago

zhangnanyue commented 1 year ago

Dear Author,

Firstly, I would like to express my gratitude for providing the fine-tuning methods for GroundingDINO. Your work is deeply appreciated. However, I've noticed a significant discrepancy in the detection results for the 'pole' object, before and after fine-tuning the SwinT weights. As illustrated in the attached image, after fine-tuning, the model seems unable to detect the 'pole' category in the image. I'm genuinely puzzled by this outcome. Could you perhaps shed some light on why this might be happening?

Furthermore, if I want to specifically fine-tune the model for the 'pole' category, could you suggest a specific method or strategy?

Thank you so much for your time and assistance. Your insights will be invaluable to my work.

Warm regards.

zhangnanyue commented 1 year ago

(Images attached: pole_image_swint, pole_image_swint_fine — detection results before and after fine-tuning.)

FengheTan9 commented 1 year ago

Hi, I ran this fine-tuning on COCO and got an error (screenshot attached).

I don't know whether something is wrong with the data or with something else. Have you ever encountered it? Thanks!

hhaAndroid commented 1 year ago

@zhangnanyue I suspect there might be a mistake in your configuration. Please provide your configuration details.

zhangnanyue commented 1 year ago

> @zhangnanyue I suspect there might be a mistake in your configuration. Please provide your configuration details.

I didn't fine-tune the original Swin-T model myself. I simply used the weight file you provided, grounding_dino_swin-t_finetune_16xb2_1x_coco_20230921_152544-5f234b20.pth, and ran:

```shell
python demo/image_demo.py pole_image.jpg configs/grounding_dino/grounding_dino_swin-t_finetune_16xb2_1x_coco.py --weights grounding_dino_swin-t_finetune_16xb2_1x_coco_20230921_152544-5f234b20.pth --texts 'pole' --pred-score-thr 0.2
```

That's when I noticed it couldn't correctly detect the 'pole' category. However, the original weight file groundingdino_swint_ogc_mmdet-822d7e9d.pth was able to detect the 'pole' category correctly.

hhaAndroid commented 1 year ago

@zhangnanyue https://github.com/open-mmlab/mmdetection/pull/11012

hhaAndroid commented 1 year ago

@zhangnanyue

```shell
python demo/image_demo.py pole_image.jpg configs/grounding_dino/grounding_dino_swin-t_finetune_16xb2_1x_coco.py --weights grounding_dino_swin-t_finetune_16xb2_1x_coco_20230921_152544-5f234b20.pth --texts 'pole.' --pred-score-thr 0.2
```

zhangnanyue commented 1 year ago

> @zhangnanyue
>
> python demo/image_demo.py pole_image.jpg configs/grounding_dino/grounding_dino_swin-t_finetune_16xb2_1x_coco.py --weights grounding_dino_swin-t_finetune_16xb2_1x_coco_20230921_152544-5f234b20.pth --texts 'pole.' --pred-score-thr 0.2

It still doesn't work. Running with the pre-trained weights does detect the pole:

```shell
python demo/image_demo.py pole_image.jpg configs/grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_cap4m.py --weights groundingdino_swint_ogc_mmdet-822d7e9d.pth --texts 'pole.' --pred-score-thr 0.2
```

However, with the COCO fine-tuned weights it still fails:

```shell
python demo/image_demo.py pole_image.jpg configs/grounding_dino/grounding_dino_swin-t_finetune_16xb2_1x_coco.py --weights grounding_dino_swin-t_finetune_16xb2_1x_coco_20230921_152544-5f234b20.pth --texts 'pole.' --pred-score-thr 0.2
```

zhangnanyue commented 1 year ago

> Hi, I ran this fine-tuning on COCO and got an error (screenshot attached).
>
> I don't know whether something is wrong with the data or with something else. Have you ever encountered it? Thanks!

Perhaps you can check out the latest README document updated by the author (https://github.com/open-mmlab/mmdetection/blob/dev-3.x/configs/grounding_dino/README.md). It includes examples on how to fine-tune for a specific category. That might be helpful to you.
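For reference, the custom-category fine-tuning in that README boils down to inheriting the COCO fine-tune config and pointing it at your own dataset and class list. A minimal sketch, roughly mirroring the README's single-category example — the paths, annotation file names, and the single 'pole' class here are placeholder assumptions, not a verified recipe:

```python
# Hypothetical config sketch for fine-tuning GroundingDINO on a custom
# single-category 'pole' dataset. All paths and file names are placeholders.
_base_ = 'grounding_dino_swin-t_finetune_16xb2_1x_coco.py'

data_root = 'data/pole/'
class_name = ('pole', )          # the only category in this dataset
num_classes = len(class_name)
metainfo = dict(classes=class_name, palette=[(220, 20, 60)])

model = dict(bbox_head=dict(num_classes=num_classes))

train_dataloader = dict(
    dataset=dict(
        data_root=data_root,
        metainfo=metainfo,
        ann_file='annotations/train.json',   # COCO-format annotations
        data_prefix=dict(img='train/')))

val_dataloader = dict(
    dataset=dict(
        data_root=data_root,
        metainfo=metainfo,
        ann_file='annotations/val.json',
        data_prefix=dict(img='val/')))

val_evaluator = dict(ann_file=data_root + 'annotations/val.json')
test_dataloader = val_dataloader
test_evaluator = val_evaluator
```

Since GroundingDINO is text-conditioned, the class names in `metainfo` double as the text prompts during training, so the spelling should match what you pass to `--texts` at inference time.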

hhaAndroid commented 1 year ago

@zhangnanyue Oops, you misunderstood. The fine-tuned weights are only valid for the COCO dataset, which has no 'pole' category, so the model was never trained on poles. Therefore, it would be appropriate to use the pre-trained weights instead.

zhangnanyue commented 1 year ago

> @zhangnanyue Oops, you misunderstood. The fine-tuned weights are only valid for the "coco" dataset, which you haven't trained on pole. Therefore, it would be appropriate to use pre-trained weights.

Thank you for your time and comprehensive explanations. I now grasp your points. Moreover, following the fine-tuning methods you provided, I fine-tuned on my custom 'pole' dataset, and the detection accuracy for the 'pole' object has indeed improved. However, I still have a lingering question: why did the model's ability to detect 'pole' diminish after fine-tuning on the COCO dataset? What exactly happened in that process? Is this a manifestation of catastrophic forgetting?

hhaAndroid commented 1 year ago

@zhangnanyue This is a good question. Fine-tuning is done to achieve significant performance improvements on a specific dataset. If you want to maintain the capabilities of the pre-trained model, it is actually recommended to perform pre-training by incorporating your own dataset instead of fine-tuning. Another viable approach is to enhance the text branch during fine-tuning by introducing certain degrees of augmentation and fixing certain weights. It requires some experimentation to determine the appropriate settings.
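One way to express "fixing certain weights" in an MMDetection config is through the optimizer's `paramwise_cfg`, which maps parameter-name prefixes to per-group learning-rate multipliers. A minimal sketch — the specific `lr_mult` values, and using a zero multiplier on the text branch as a stand-in for freezing it, are assumptions to experiment with, not a verified recipe:

```python
# Hypothetical sketch: keep the language (text) branch effectively fixed
# during fine-tuning via a zero learning-rate multiplier, and apply
# gentler updates to the Swin-T image backbone. Values are illustrative.
optim_wrapper = dict(
    type='OptimWrapper',
    optimizer=dict(type='AdamW', lr=1e-4, weight_decay=1e-4),
    paramwise_cfg=dict(
        custom_keys={
            'language_model': dict(lr_mult=0.0),  # freeze text branch
            'backbone': dict(lr_mult=0.1),        # reduced LR for Swin-T
        }))
```

The intuition matches the comment above: the text branch encodes the open-vocabulary grounding learned during pre-training, so shielding it from COCO-only gradients may help preserve zero-shot categories like 'pole'.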

zhangnanyue commented 1 year ago

I sincerely appreciate your response, which has clarified my doubts. Moving forward, I will attempt some fine-tuning with fixed weights and hope this proves effective. Once again, thank you for your time and your contributions.

ws1hope commented 1 year ago

> @zhangnanyue Oops, you misunderstood. The fine-tuned weights are only valid for the "coco" dataset, which you haven't trained on pole. Therefore, it would be appropriate to use pre-trained weights.
>
> Thank you for your time and comprehensive explanations. I now grasp your points. Moreover, following the fine-tuning methods you provided, I fine-tuned on my custom 'pole' dataset, and the detection accuracy for the 'pole' object has indeed improved. However, I still have a lingering question: Why did the model's capability to detect 'pole' diminish after fine-tuning on the Coco dataset? What precisely transpired in this process? Is this a manifestation of forgetting?

My results differ from yours. I used the groundingdino_swint_ogc_mmdet-822d7e9d weights to fine-tune on my own dataset, whose categories do not appear in COCO, and the results improved, which is normal. When I instead started from the weights fine-tuned on COCO and fine-tuned on my custom dataset, the results decreased only slightly; there was no drastic decline like the one you mentioned.