IDEA-Research / GroundingDINO

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
https://arxiv.org/abs/2303.05499
Apache License 2.0

Visualizing Grounding DINO predicted bounding boxes in the Voxel51 tool (https://voxel51.com/): bounding box position is changing #259

Open solomonmanuelraj opened 11 months ago

solomonmanuelraj commented 11 months ago

Hi Team,

Thanks for your help.

    boxes, logits, phrases = predict(
        model=model,
        image=image,
        caption=TEXT_PROMPT,
        box_threshold=BOX_TRESHOLD,
        text_threshold=TEXT_TRESHOLD
    )

My assumption is that boxes has shape [nq, 4], with each row holding (CenterX, CenterY, W, H) normalized to values between 0 and 1.

    import fiftyone as fo

    detections = []

    for box, score, label in zip(boxes, scores, phrases):
        box = [round(i, 2) for i in box.tolist()]
        detections.append(
            fo.Detection(
                label=label,
                bounding_box=box,
                confidence=score
            )
        )

    # Save predictions to dataset
    sample["gdino_pred"] = fo.Detections(detections=detections)
    sample.save()

When I view the output in the Voxel51 tool, the bounding boxes appear in a different position than expected.

I have attached a sample image as well. What is the reason, and how can I fix this problem? For OWL-ViT model output, we could visualize the boxes correctly.

I only have this problem with Grounding DINO model output.

I used BDD100K image data as input (1280x720 RGB images).

Waiting for your response.

solomonmanuelraj commented 11 months ago

Screenshot attached
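
A possible cause, offered as a sketch rather than a confirmed diagnosis: if boxes is in normalized (CenterX, CenterY, W, H) form as assumed above, it does not match the layout FiftyOne documents for fo.Detection, which is [top-left-x, top-left-y, width, height] in relative coordinates. Passing center coordinates where a top-left corner is expected would shift every box down and to the right by half its size. The helper below, cxcywh_to_tlwh, is a hypothetical name introduced only for illustration; it is not part of GroundingDINO or FiftyOne.

```python
def cxcywh_to_tlwh(box):
    """Convert a normalized [cx, cy, w, h] box to normalized [x, y, w, h].

    Assumption: the input is center-based and already scaled to 0..1,
    as described in the post above. The output uses the top-left corner,
    which is the layout FiftyOne documents for Detection.bounding_box.
    """
    cx, cy, w, h = box
    # Move from the box center to the top-left corner; size is unchanged.
    return [cx - w / 2.0, cy - h / 2.0, w, h]


# Illustrative values only: a box centered in the image,
# 20% of the image wide and 40% of the image tall.
converted = cxcywh_to_tlwh([0.5, 0.5, 0.2, 0.4])
print(converted)
```

In the loop from the post, this would mean passing bounding_box=cxcywh_to_tlwh(box.tolist()) instead of the raw box. Rounding to two decimals before the conversion can also move a box by several pixels on a 1280x720 image, so it may be safer to round afterwards or not at all.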