-
To the Authors
This is very interesting and solid work on visual grounding tasks with a query-based detector. The paper is also well written and clear. Super interesting results with GLIGEN as we…
-
If I want to run G-DINO multiple times on one prompt alone,
can I save some time in inference somehow?
Or how could I distill/decrease the model weights/inference time when I know I have 1 prompt …
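If the goal is just reusing one fixed prompt over many images, the simplest saving is to load the model once and keep calling it with the same caption. Below is a minimal sketch, assuming the `load_model` / `load_image` / `predict` helpers from `groundingdino.util.inference`; the config/checkpoint paths and image names are placeholders, and exact names may differ in your checkout:

```python
# Minimal sketch: load Grounding DINO once and reuse it for a single fixed prompt.
# Assumes the helpers in groundingdino.util.inference; verify against your version.
import torch
from groundingdino.util.inference import load_model, load_image, predict

CONFIG = "groundingdino/config/GroundingDINO_SwinT_OGC.py"  # adjust to your checkout
WEIGHTS = "weights/groundingdino_swint_ogc.pth"             # adjust to your checkpoint
PROMPT = "a dog. a cat."                                    # the single fixed prompt

model = load_model(CONFIG, WEIGHTS)  # load once, not per call

image_paths = ["img1.jpg", "img2.jpg", "img3.jpg"]          # hypothetical inputs
with torch.no_grad():
    for path in image_paths:
        image_source, image = load_image(path)
        boxes, logits, phrases = predict(
            model=model,
            image=image,
            caption=PROMPT,
            box_threshold=0.35,
            text_threshold=0.25,
        )
        print(path, phrases, boxes.shape)
```

Caching the text-encoder output for the fixed prompt would save more, but that means going below the `predict` helper into the model's forward, which is version-specific; distilling to a smaller prompt-specific detector is a separate effort.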
-
Inference is currently performed on a single image with a single text prompt. Is it possible to batch inputs (images or text prompts)?
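There is no batched helper that I know of, but as a rough sketch the model forward used inside the repo's `predict()` takes a batch of images plus one caption per image, so batching same-sized images might look like the following. The forward signature, preprocessing, and paths here are assumptions to verify against your installed version:

```python
# Hedged sketch of batched inference: stack same-sized, normalized images and pass
# one caption per image. Assumes model(images, captions=[...]) as used inside the
# repo's predict() helper; verify against your version before relying on this.
import torch
import torchvision.transforms as T
from PIL import Image
from groundingdino.util.inference import load_model

model = load_model("groundingdino/config/GroundingDINO_SwinT_OGC.py",
                   "weights/groundingdino_swint_ogc.pth")

# Fixed-size preprocessing (a simplification of the repo's own transforms) so the
# images can be stacked into a single batch tensor.
preprocess = T.Compose([
    T.Resize((800, 1200)),
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

paths = ["img1.jpg", "img2.jpg"]          # hypothetical inputs
caption = "a dog. a cat."
images = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths])

with torch.no_grad():
    outputs = model(images, captions=[caption] * len(paths))

# Predictions come back batched, e.g. pred_boxes has shape (B, num_queries, 4).
print(outputs["pred_boxes"].shape)
```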
-
Please add Promptgen 1.5 support.
-
Check the instructions on the MDETR repo to find the proper validation code.
1. Adapt the test data to the evaluation code
2. Make sure to use the right model (how to call the model, etc.)
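As a rough illustration of step 2, the MDETR README and notebooks describe torch.hub entry points; a hedged sketch of loading the model and running one caption through it follows. The model name, transform, image, caption, and threshold here are assumptions to double-check against the repo:

```python
# Hedged sketch of calling MDETR for evaluation-style inference.
# Assumes the torch.hub entry point and postprocessor described in the MDETR repo;
# verify the exact model name and transforms before using.
import torch
from PIL import Image
import torchvision.transforms as T

model, postprocessor = torch.hub.load(
    "ashkamath/mdetr:main", "mdetr_efficientnetB5",
    pretrained=True, return_postprocessor=True,
)
model.eval()

transform = T.Compose([
    T.Resize(800),
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

image = Image.open("example.jpg").convert("RGB")      # hypothetical test image
caption = "the woman holding a red umbrella"          # hypothetical referring expression
img = transform(image).unsqueeze(0)

with torch.no_grad():
    outputs = model(img, [caption])

# Keep confident detections; the last logit slot is the "no object" class.
probas = 1 - outputs["pred_logits"].softmax(-1)[0, :, -1]
keep = probas > 0.7
print(outputs["pred_boxes"][0, keep])  # normalized cxcywh boxes
```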
-
I have two questions from running the demo "grounded_sam2_florence2_image_demo.py":
1.
Traceback (most recent call last):
File "/opt/conda/envs/compression/lib/python3.10/site-packages…
-
What exciting work!
However, the features available in the online demo and the locally hosted demo are the same: only images can be input, and the model returns boxes and a caption. But the paper mentions ma…
-
- sampling and aliasing should not be capitalized; they're regular words.
- on sampling: "We sample because there is a cost to data which is memory." This is not quite right technically. Sampling is …
-
Hello, authors. I would like to ask two questions.
1. How do you handle the box query features and point query features after deformable cross-attention: do you concatenate them?
2. How do you get the corresponding text prompts…
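For readers skimming this question, here is a purely illustrative sketch of what "concatenating box and point query features" could mean in PyTorch; it is not taken from the paper, whose actual fusion choice is exactly what the question asks about:

```python
# Illustrative only: two ways box/point query features might be combined after
# cross-attention. Shapes and the choice between them are hypothetical.
import torch

batch, n_box, n_point, dim = 2, 300, 100, 256
box_queries = torch.randn(batch, n_box, dim)      # hypothetical box query features
point_queries = torch.randn(batch, n_point, dim)  # hypothetical point query features

# Option A: concatenate along the query dimension and process jointly downstream.
joint_queries = torch.cat([box_queries, point_queries], dim=1)  # (2, 400, 256)

# Option B: keep them separate and feed each to its own prediction head.
print(joint_queries.shape)
```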
-
Hello everyone, thank you very much for your contribution. I appreciate the effort and consistency in uploading the code for so many models and maintaining this repository.
I saw Kosmos-2 and I q…