Open Thomas-Malchers opened 11 months ago
Thanks for reporting, will look into it.
In the meantime, could you try using OwlViTProcessor instead of Owlv2Processor and see whether you get better results?
Thanks for the quick reply. In the meantime, I had a look at the OwlViT v1 processor, and when I swap the processor in this notebook: https://github.com/huggingface/notebooks/blob/main/examples/zeroshot_object_detection_with_owlvit.ipynb for the updated v2 models/processors, it seems to work. I assume that accessing the results directly, instead of calling results = processor.post_process_object_detection(outputs=outputs, target_sizes=target_sizes, threshold=0.2), makes the difference.
Edit: Never mind, my notebook was still in an older state. Using the above code does not solve the problem either.
Yeah, I've run the original Colab by the authors on the cats image. When visualizing the bounding boxes, they draw them on the preprocessed (padded + resized) image, not on the original image.
So we need to do the postprocessing relative to the padded image rather than the original one. Will look more into this over the weekend.
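To illustrate the point above, here is a minimal sketch in plain Python, not the library's actual fix. It assumes that the v2 preprocessing pads the image to a square on the bottom and right (so the top-left corner stays fixed) and that the predicted boxes are normalized (x0, y0, x1, y1) corners relative to that padded square. Both function names are hypothetical helpers written for this example.

```python
def scale_boxes_naive(norm_boxes, width, height):
    """Scale normalized boxes by the ORIGINAL image size (ignores padding)."""
    return [
        (x0 * width, y0 * height, x1 * width, y1 * height)
        for x0, y0, x1, y1 in norm_boxes
    ]


def scale_boxes_padded(norm_boxes, width, height):
    """Scale normalized boxes by the PADDED square size, then clip.

    Assumption: padding was added only at the bottom/right, so pixel
    coordinates in the padded square match those in the original image.
    """
    side = max(width, height)  # edge length of the padded square
    scaled = []
    for x0, y0, x1, y1 in norm_boxes:
        scaled.append((
            min(x0 * side, width),   # clip to the original image bounds
            min(y0 * side, height),
            min(x1 * side, width),
            min(y1 * side, height),
        ))
    return scaled


# A 640x480 (landscape) image and one box predicted on the padded square.
boxes = [(0.25, 0.25, 0.5, 0.5)]
naive = scale_boxes_naive(boxes, 640, 480)
padded = scale_boxes_padded(boxes, 640, 480)
```

For this landscape example, the naive version yields smaller y coordinates than the padded version while the x coordinates agree, which is consistent with the boxes appearing shifted upwards. With the real processor, the analogous workaround under the same assumption would be to pass the padded square size as the target size and clip afterwards.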
Hey, I am trying to run your OwlViT v2 notebook on my local machine. Whenever I use an image that is not square, such as the cat example that you also use, the resulting bounding boxes are slightly shifted upwards. I tried multiple images, file types, etc., and the problem persists. Is there some parameter I am missing? The problem also does not seem to be in the plotting itself, as the coordinates of the bounding boxes are already wrong.