Can VILA do grounding jobs?

Hello, I asked VILA to give the bounding boxes (bbox) of the objects in a photo, and VILA did reply with a bbox. I then used the following code to check whether it is correct.

from PIL import Image, ImageDraw

# open image
image_path = 'cup.jpg'
image = Image.open(image_path)
draw = ImageDraw.Draw(image)

# bbox returned by VILA (normalized; treated here as [x1, y1, x2, y2])
normalized_bbox = [0.34, 0.7, 0.46, 0.78]

# denormalize to pixel coordinates
image_width, image_height = image.size
bbox = [
    normalized_bbox[0] * image_width,
    normalized_bbox[1] * image_height,
    normalized_bbox[2] * image_width,
    normalized_bbox[3] * image_height
]

# draw bounding box
draw.rectangle(bbox, outline="red", width=3)

# save the annotated image
output_path = './cup_with_bbox.jpg'
image.save(output_path)

The drawn bbox has the correct size, but its center point is wrong. How should I map the normalized_bbox back onto the photo?
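
One thing I have not ruled out is that the coordinates are normalized against a padded square version of the image rather than the original. This is only a guess on my side: I do not know whether VILA pads the image this way or what coordinate convention it uses. If it did, a minimal sketch of the reverse mapping could look like the following, where `unpad_bbox` is a hypothetical helper of mine, not part of VILA:

```python
def unpad_bbox(normalized_bbox, image_width, image_height):
    """Map an [x1, y1, x2, y2] bbox normalized to a centered, padded square
    back to pixel coordinates of the original (unpadded) image.

    ASSUMPTION: the image was padded symmetrically to a square whose side is
    max(width, height). This is not confirmed behavior of VILA.
    """
    side = max(image_width, image_height)
    # Padding added on each side of the shorter dimension
    pad_x = (side - image_width) / 2
    pad_y = (side - image_height) / 2

    x1, y1, x2, y2 = normalized_bbox
    # Scale by the square side, then shift back by the padding offset
    box = [
        x1 * side - pad_x,
        y1 * side - pad_y,
        x2 * side - pad_x,
        y2 * side - pad_y,
    ]
    # Clamp to the original image bounds
    limits = [image_width, image_height, image_width, image_height]
    return [min(max(value, 0), limit) for value, limit in zip(box, limits)]
```

With this, the denormalization step above would become `bbox = unpad_bbox(normalized_bbox, image_width, image_height)`. For a square image the result is the same as the plain multiplication, so this would only matter for non-square photos.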

NVlabs / VILA

Can VILA do grounding jobs? #128