longzw1997 / Open-GroundingDino

This is the third party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection.
MIT License
457 stars 71 forks

question about training VG data #53

Open Nathan-Li123 opened 10 months ago

Nathan-Li123 commented 10 months ago

I've noticed during training that when I use my custom dataset (in VG format), the model's performance degrades significantly when different objects share the same description. How can I address this problem? If I treat the description sentences as class names and convert the custom dataset to object detection (OD) format, would that resolve the issue? Looking forward to your reply, thanks!
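For reference, the conversion described above could be sketched roughly as follows. This is a minimal illustration, not the repository's own tooling: the function name `phrases_to_od` is hypothetical, and the record layout (`grounding.regions[].phrase` for VG-style records, `detection.instances[]` with `bbox`, `label`, `category` for OD-style records) is an assumption that should be checked against the actual dataset files. Each distinct description becomes one class, so objects that share a description end up with the same label.

```python
from typing import Dict, List, Tuple


def phrases_to_od(records: List[dict]) -> Tuple[List[dict], Dict[str, int]]:
    """Turn VG-style records into OD-style records, one class per distinct phrase.

    Field names are assumptions about the dataset layout, not a confirmed schema.
    """
    label_map: Dict[str, int] = {}  # normalized phrase -> integer class id
    converted: List[dict] = []
    for rec in records:
        instances = []
        for region in rec["grounding"]["regions"]:
            # Normalize so trivially different spellings map to the same class.
            phrase = region["phrase"].strip().lower()
            # Assign the next free id the first time a phrase is seen.
            label = label_map.setdefault(phrase, len(label_map))
            instances.append(
                {"bbox": region["bbox"], "label": label, "category": phrase}
            )
        converted.append(
            {
                "filename": rec["filename"],
                "height": rec["height"],
                "width": rec["width"],
                "detection": {"instances": instances},
            }
        )
    return converted, label_map
```

The returned `label_map` would also have to be saved as the class-name list for training. Whether this conversion actually improves results is not something this thread confirms.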

Nathan-Li123 commented 10 months ago

I have now realized that the issue is not related to the "different objects sharing the same description" problem I mentioned earlier, but rather to poor training performance when using a smaller VG dataset. Could you confirm whether the learning rates and other parameters used for fine-tuning with a VG-format dataset are the same as those in the config/cfg_odvg.py file? Or are those parameters only suitable for training with OD-format datasets?

Knivacke commented 9 months ago

Hey! Did you get this to work? I'm trying to fine-tune the model for my master's thesis, but am honestly having a hard time getting it to work.