How to Set Vision Prompt Image and Target Image Using API

IDEA-Research / T-Rex

API for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

https://deepdataspace.com/home

How to Set Vision Prompt Image and Target Image Using API #68

Status: Closed (wangnan2021 closed this issue 2 weeks ago)

wangnan2021 commented 2 weeks ago

In the provided online demo, the vision prompt image and the image to be detected can be two different images: you draw a prompt on the vision prompt image, and the prompted object is then detected in the other image. How can this be done through the API?

Mountchicken commented 2 weeks ago

Hi @wangnan2021, this function can be called through the generic inference API. See the example here: https://github.com/IDEA-Research/T-Rex/blob/trex2/demo_examples/generic_inference.py
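For readers who want to see the shape of the idea before opening the linked example, here is a minimal sketch of how a cross-image request could be organized: the boxes are tied to the prompt image, and the target image is passed separately. Everything in it is an assumption made for illustration. The class names `VisualPrompt` and `CrossImageRequest`, the field names `target_image`, `prompt_image`, and `rects`, and the file names are hypothetical and are not taken from the T-Rex API; the real parameter names are in `generic_inference.py`.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VisualPrompt:
    """Boxes drawn on one reference image (hypothetical structure)."""
    prompt_image: str              # path or URL of the image carrying the prompt
    boxes_xyxy: List[List[float]]  # [x0, y0, x1, y1] boxes, assumed to be in pixels


@dataclass
class CrossImageRequest:
    """A target image plus prompts that may come from other images (hypothetical)."""
    target_image: str
    prompts: List[VisualPrompt] = field(default_factory=list)

    def to_payload(self) -> Dict:
        # Reject boxes with zero or negative area before building the payload.
        for prompt in self.prompts:
            for box in prompt.boxes_xyxy:
                x0, y0, x1, y1 = box
                if not (x0 < x1 and y0 < y1):
                    raise ValueError(f"degenerate box: {box}")
        # Field names below are placeholders, not the real API schema.
        return {
            "target_image": self.target_image,
            "prompts": [
                {"prompt_image": p.prompt_image, "rects": p.boxes_xyxy}
                for p in self.prompts
            ],
        }


request = CrossImageRequest(
    target_image="shelf_photo.jpg",
    prompts=[VisualPrompt("reference_can.jpg", [[12.0, 30.0, 96.0, 180.0]])],
)
payload = request.to_payload()
```

The point of the sketch is only that the prompt image and the target image are independent inputs, so the same prompt can be reused against any number of target images.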

wangnan2021 commented 2 weeks ago

Thank you for your response. Could you please confirm if T-Rex2 is unable to output instance masks? If so, is it possible to use the API to call T-Rex1 instead?

Mountchicken commented 2 weeks ago

Hi @wangnan2021, T-Rex2 doesn't support mask output. To get masks you need an additional interactive segmentation model such as SAM. The same applies to T-Rex1.
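A rough sketch of the box-to-mask step suggested above, using the `segment_anything` package's `SamPredictor` with a box prompt. It assumes the detector's boxes are available as pixel-space `[x0, y0, x1, y1]` lists and that the image is an H x W x 3 uint8 RGB array; neither is confirmed by this thread. The helper names `clip_box_xyxy` and `masks_from_boxes` are made up for illustration, and the checkpoint path is whatever SAM weights you have downloaded.

```python
from typing import List, Sequence


def clip_box_xyxy(box: Sequence[float], width: int, height: int) -> List[int]:
    """Round a box to integer pixels and clamp it inside the image."""
    x0, y0, x1, y1 = box
    x0 = min(max(int(round(x0)), 0), width - 1)
    y0 = min(max(int(round(y0)), 0), height - 1)
    x1 = min(max(int(round(x1)), 0), width - 1)
    y1 = min(max(int(round(y1)), 0), height - 1)
    return [x0, y0, x1, y1]


def masks_from_boxes(image_rgb, boxes, checkpoint_path, model_type="vit_h"):
    """Return one boolean H x W mask per box, using SAM box prompts."""
    # Imported lazily so the helper above works without these packages installed.
    import numpy as np
    from segment_anything import SamPredictor, sam_model_registry

    sam = sam_model_registry[model_type](checkpoint=checkpoint_path)
    predictor = SamPredictor(sam)
    predictor.set_image(image_rgb)  # expects H x W x 3 uint8, RGB order

    height, width = image_rgb.shape[:2]
    masks = []
    for box in boxes:
        clipped = np.array(clip_box_xyxy(box, width, height))
        # multimask_output=False returns a single mask with shape (1, H, W).
        mask, _scores, _logits = predictor.predict(box=clipped, multimask_output=False)
        masks.append(mask[0])
    return masks
```

Clamping the boxes first avoids passing out-of-image coordinates to the segmenter; whether the detector can actually return such boxes is not stated here, so treat that step as a precaution.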