-
### Feature Request: Implementing Masked Video Segmentation with Object Detection - GroundingSAM with Overeasy
**Description:**
I would like to request the integration of masked segmentation from …
-
We combine Grounding DINO, Grounding DINO 1.5, and SAM 2 to track any object in the input video, and we've open-sourced our code here: [Grounded SAM 2](https://github.com/IDEA-Research/Grounded-SAM-…
-
*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/python/onnxruntime_pybind_state.cc:754 std::unique_ptr onnxruntime::python::CreateExecutionProviderInstance(const onnxru…
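This EP Error usually means the requested execution provider (often CUDA) is not available in the installed onnxruntime build, and a common workaround is to filter the provider list against `onnxruntime.get_available_providers()` before creating the session. A minimal sketch of that selection logic, where `pick_providers` is a hypothetical helper (not part of onnxruntime):

```python
def pick_providers(available, preferred=("CUDAExecutionProvider", "CPUExecutionProvider")):
    """Return the preferred execution providers that are actually available,
    falling back to CPU if none of the preferred ones are present."""
    chosen = [p for p in preferred if p in available]
    return chosen or ["CPUExecutionProvider"]

# Example: a CPU-only onnxruntime build reports no CUDA provider
print(pick_providers(["CPUExecutionProvider", "AzureExecutionProvider"]))
# ['CPUExecutionProvider']
```

The resulting list would then be passed as the `providers` argument of `onnxruntime.InferenceSession`, so the session never requests a provider the build cannot create.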
-
### Question
Hi!
I'm trying out some of the zero-shot capabilities and I've been working with OWLv2, but I was wondering: is support for YOLO-World and Grounding DINO coming? They seem to be f…
-
Hello!
To detect objects:
```
detections = grounding_dino_model.predict_with_classes(
    image=image,
    classes=enhance_class_name(class_names=CLASSES),
    box_threshold=BOX_TRESHOLD,
    …
```
-
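The snippet above passes the class list through an `enhance_class_name` helper before prompting Grounding DINO. In the commonly circulated Grounded-SAM notebooks this helper simply rewraps each bare class name into a fuller text prompt; a minimal sketch assuming that behavior:

```python
from typing import List

def enhance_class_name(class_names: List[str]) -> List[str]:
    # Turn each bare class name into a prompt phrase that Grounding DINO
    # tends to match more reliably, e.g. "car" -> "all cars".
    return [f"all {name}s" for name in class_names]

CLASSES = ["car", "dog"]
print(enhance_class_name(class_names=CLASSES))  # ['all cars', 'all dogs']
```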
**_Hello, I got an error when running video inference (demo/video_demo.py) with a model fine-tuned on my own dataset with Grounding DINO. The problem is as follows:_**
(mmenv_new) (base) [zhoujianbang@ai mmdetection-dev-3.x]$ /ssd2/zhoujianbang/envs/mmenv_new/bin/python /ssd/zhoujianbang/proj…
-
When observing the SFT data format of Qwen2-VL, I found that it seems quite different from Qwen-VL's. In particular, no annotated example of bounding boxes is given any more, and regarding the 0-1000 coordinate normalization: is that operation still required in Qwen2-VL? Any guidance appreciated.
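For context, Qwen-VL's grounding format stores box coordinates normalized to a 0-1000 grid; whether Qwen2-VL still expects this is exactly the open question above, but the transformation itself is simple. A minimal sketch, where `normalize_box` is a hypothetical helper (not from either codebase):

```python
def normalize_box(box, width, height, scale=1000):
    """Map a pixel-space (x1, y1, x2, y2) box to the 0-1000 grid
    used in Qwen-VL-style grounding annotations."""
    x1, y1, x2, y2 = box
    return (
        round(x1 / width * scale),
        round(y1 / height * scale),
        round(x2 / width * scale),
        round(y2 / height * scale),
    )

# A box centered in a 1280x960 image maps to the 0-1000 grid
print(normalize_box((320, 240, 640, 480), width=1280, height=960))
# (250, 250, 500, 500)
```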
-
3D visual grounding, which aims to link natural language descriptions to specific regions of a 3D scene represented as point clouds, is a fundamental task for human-robot interaction. The recogniti…
-
As stated in the question, when the object I want to detect does not appear at the beginning of my video, the code raises an error at runtime. What method should I use to eliminate this hidden …
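A common cause of this failure mode is that tracking is initialized from detections on the first frame, which breaks when that frame contains no match for the prompt. One typical workaround is to scan forward for the first frame where the detector actually fires and start tracking there. A minimal sketch of that guard, with hypothetical names and plain lists standing in for per-frame detection results:

```python
def first_frame_with_detections(per_frame_detections):
    """Return the index of the first frame whose detection list is
    non-empty, or None if the object never appears in the video."""
    for idx, detections in enumerate(per_frame_detections):
        if detections:  # skip frames where nothing was detected
            return idx
    return None

# Object appears only from frame 2 onward; boxes are (x1, y1, x2, y2)
frames = [[], [], [(10, 20, 50, 60)], [(12, 22, 52, 62)]]
print(first_frame_with_detections(frames))  # 2
```

With this index in hand, the tracker can be initialized at that frame instead of frame 0, and a `None` result can be turned into a clear "object not found" message rather than a crash.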
-
Thanks for your brilliant work!
I'm wondering if the model can detect all objects, such as a 'grounding dino'?