Querying Coordinate Output Capabilities in T-Rex Project Web Demo

IDEA-Research / T-Rex

[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

https://deepdataspace.com/blog/T-Rex


Querying Coordinate Output Capabilities in T-Rex Project Web Demo #42

Closed. CarrieX6 closed this issue 7 months ago

CarrieX6 commented 7 months ago

Hello, I greatly appreciate the work you've put into the T-Rex(2) project; it has been very inspiring to me. I am interested in knowing if it is possible to directly obtain the coordinates of the bounding boxes or the center points when using the web demo for testing. Or, would I need to fork the project and run it on my own to achieve this functionality?

Mountchicken commented 7 months ago

Hi @CarrieX6, thanks for your interest in our work. In our Gradio demo, you can get all the bounding box outputs in COCO format: (screenshot of the demo output)
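For readers who want corner coordinates or center points from that output, here is a minimal sketch. It assumes the demo's boxes follow the standard COCO convention, where `bbox` is `[x, y, width, height]` with `(x, y)` as the top-left corner in pixels. The helper names and the sample annotation are hypothetical and are not taken from the T-Rex code base, so check the field layout against the actual demo output before relying on it.

```python
from typing import Sequence, Tuple


def coco_bbox_to_corners(bbox: Sequence[float]) -> Tuple[float, float, float, float]:
    """Convert a COCO-style [x, y, width, height] box to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


def coco_bbox_center(bbox: Sequence[float]) -> Tuple[float, float]:
    """Return the (cx, cy) center point of a COCO-style [x, y, width, height] box."""
    x, y, w, h = bbox
    return (x + w / 2.0, y + h / 2.0)


# Hypothetical annotation, shaped like a standard COCO detection result.
annotation = {
    "image_id": 0,
    "category_id": 1,
    "bbox": [10.0, 20.0, 30.0, 40.0],  # assumed layout: x, y, width, height
    "score": 0.9,
}

corners = coco_bbox_to_corners(annotation["bbox"])
center = coco_bbox_center(annotation["bbox"])
print(corners)  # (10.0, 20.0, 40.0, 60.0)
print(center)   # (25.0, 40.0)
```

If the output turns out to use corner coordinates instead of width and height, only the two helper functions would need to change.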

CarrieX6 commented 7 months ago

Thank you very much for your response. It has resolved my query.

CarrieX6 commented 7 months ago

I have also run into an issue with restricting the output to specific categories for a given input (the 'category_id'). There seems to be no option for outputting only specified categories in the [Gradio demo](https://huggingface.co/spaces/Mountchicken/T-Rex2). Is there a solution available in the API for this? Thank you very much for your assistance.

Mountchicken commented 7 months ago

Neither the Gradio demo nor the DDS demo supports multi-category inference for now. However, this is available in the API. Check this example code for multi-category inference in the interactive visual prompt mode. Check this comment for multi-category inference in the generic visual prompt mode.

CarrieX6 commented 7 months ago

Got it. Thanks a lot for your response. Have a nice day!