Open Nitaym opened 1 month ago
Hey @amyeroberts, are you the relevant person for this bug?
I have further questions, if possible:
Regarding labels - What should the "class_labels" tensor be filled with? Where should I get the right class indices from? Since this is an open-set detection model, I assume there's no simple class index dictionary.
Is there example code somewhere for fine-tuning this GroundingDino model with huggingface / custom datasets?
Thanks! Nitay
cc @EduardoPach
TL;DR
I will work on fixing this this week :)
Hey, thanks for opening the issue! The implementation of GroundingDinoLoss is not actually correct: when adding the model I didn't focus much on getting it right, as the original repo doesn't have training code or the loss calculation.
That being said, I found an issue in the original repo where the authors point to other repos that implement training for Grounding DINO, so I will use those and check against the paper to fix this :)
Thanks @EduardoPach!
I'll be happy to assist as needed. Could you point me to the reference implementations you've mentioned?
Any update @EduardoPach?
I have added the corrections (haven't created the PR yet); I just need to test them now. I will probably do that over the weekend.
System Info
transformers==4.40.2
Python 3.10.14
Ubuntu on WSL under Windows 10
Who can help?
@amyeroberts
Reproduction
I've been trying to fine-tune GroundingDino with transformers' GroundingDinoForObjectDetection. To keep things simple I've been using batch_size = 1 (I haven't tried any other batch sizes).
When running the model, I got this exception:
(There were indeed 3 bounding boxes in the label data)
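For reference, here is a minimal sketch of how a label dict for one image with 3 boxes might be built. It is not the code from this report. It assumes the DETR-style label layout used elsewhere in transformers (a "class_labels" entry plus "boxes" in normalized center format), and it assumes, purely for illustration, that each class index is the position of the box's phrase in the text prompt. The helper names, the prompt, and the box values are all hypothetical.

```python
def xyxy_to_normalized_cxcywh(box, image_width, image_height):
    """Convert a pixel-space (x_min, y_min, x_max, y_max) box to
    normalized (center_x, center_y, width, height)."""
    x_min, y_min, x_max, y_max = box
    return [
        (x_min + x_max) / 2 / image_width,
        (y_min + y_max) / 2 / image_height,
        (x_max - x_min) / image_width,
        (y_max - y_min) / image_height,
    ]


def build_label(boxes_xyxy, phrases, prompt_phrases, image_width, image_height):
    """Build one per-image label dict as plain Python lists.

    ASSUMPTION (not confirmed in this thread): the class index of a box is
    the position of its phrase within the list of prompt phrases.
    """
    return {
        "class_labels": [prompt_phrases.index(p) for p in phrases],
        "boxes": [
            xyxy_to_normalized_cxcywh(b, image_width, image_height)
            for b in boxes_xyxy
        ],
    }


# Hypothetical prompt "a cat. a dog." split into its phrases.
prompt_phrases = ["a cat", "a dog"]

# Hypothetical 640x480 image annotated with 3 boxes.
label = build_label(
    boxes_xyxy=[(0, 0, 320, 240), (320, 240, 640, 480), (160, 120, 480, 360)],
    phrases=["a cat", "a dog", "a cat"],
    prompt_phrases=prompt_phrases,
    image_width=640,
    image_height=480,
)
print(label)
```

Before being passed to the model as `labels=[label]`, the two lists would presumably be wrapped as a `torch.LongTensor` of shape `(3,)` and a `torch.FloatTensor` of shape `(3, 4)`.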
Expected behavior
Loss should be calculated with no errors