-
Aiming to link natural language descriptions to specific regions of a 3D scene represented as a point cloud, 3D visual grounding is a fundamental task for human-robot interaction. The recogniti…
-
Using crowdsourcing services, we collected 63,602 descriptions for approximately 249 unique objects across 1,380 scans, forming the RIORefer dataset.
[Paper](https://arxiv.org/pdf/2305.13876) [Code](https:…
-
Thanks for sharing the work. I notice that the model can output the coordinates of 3D bounding boxes as numerical values. How can I access this data for the 3D grounding tasks?
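For reference, 3D grounding outputs in this line of work are commonly given as six numbers, a box center plus its size along each axis. Below is a minimal sketch of turning that layout into explicit corner points; the `box_center_size_to_corners` helper is illustrative and assumes an axis-aligned box, it is not taken from the repo:

```python
import numpy as np

def box_center_size_to_corners(box):
    """Convert a 3D box given as (cx, cy, cz, dx, dy, dz) into its 8 corners.

    Assumes an axis-aligned box in the (center, size) convention common to
    ScanRefer-style grounding outputs.
    """
    center, size = np.asarray(box[:3], float), np.asarray(box[3:6], float)
    # sign pattern of the 8 corners of a unit cube, scaled by half the extents
    signs = np.array([[sx, sy, sz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    return center + 0.5 * size * signs

corners = box_center_size_to_corners([1.0, 2.0, 0.5, 2.0, 2.0, 1.0])
print(corners.min(axis=0), corners.max(axis=0))
```

The min/max over the corners recover the box's axis-aligned extent, which is usually what IoU-based grounding metrics consume.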
-
The multi-modal 3D dataset encompasses 1.4M meta-annotated captions on 109k objects and 7.7k regions, as well as over 3.04M diverse samples for 3D visual grounding and question-answering benchmarks. The ov…
-
Dear authors,
I am wondering why the paper says that Vote2Cap is evaluated on the ScanRefer benchmark rather than Scan2Cap.
As far as I understand, ScanRefer takes point clouds with a text query as inputs and …
-
# Interesting papers
## The battle over camera pose estimation?
- [Pan 2024 - Global Structure-from-Motion Revisited](https://lpanaf.github.io/eccv24_glomap/)
- One of the COLMAP authors is involved. Improves the global mapping stage of COLMAP. What used to take a week …
-
[Issue format]
Paper name/title:
Project link:
Paper link:
Code link:
-
### Model description
Kosmos-2 is a grounded multimodal large language model which, compared with Kosmos-1, adds grounding and referring capabilities. The model can accept image regions select…
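Kosmos-2 expresses grounding by emitting location tokens: each `<phrase>…</phrase>` span is followed by an `<object>` span holding `<patch_index_XXXX>` tokens that index cells of a 32×32 grid over the image. The sketch below decodes that text format into normalized boxes by hand; it assumes the 32×32 grid from the Kosmos-2 report and a single top-left/bottom-right index pair per object (in practice the HuggingFace processor's post-processing handles this for you):

```python
import re

GRID = 32  # Kosmos-2 quantizes image coordinates into a 32x32 grid of bins

def patch_index_to_xy(idx, corner):
    # Map a flat patch index back to a normalized (x, y) point; the
    # bottom-right corner uses the far edge of its grid cell.
    row, col = divmod(idx, GRID)
    if corner == "br":
        row, col = row + 1, col + 1
    return col / GRID, row / GRID

def parse_grounded_boxes(text):
    """Extract (phrase, normalized xyxy box) pairs from Kosmos-2 style text."""
    pattern = re.compile(
        r"<phrase>(.*?)</phrase><object>"
        r"<patch_index_(\d+)><patch_index_(\d+)></object>"
    )
    results = []
    for phrase, tl, br in pattern.findall(text):
        x0, y0 = patch_index_to_xy(int(tl), "tl")
        x1, y1 = patch_index_to_xy(int(br), "br")
        results.append((phrase, (x0, y0, x1, y1)))
    return results

sample = ("<phrase>a snowman</phrase><object>"
          "<patch_index_0044><patch_index_0863></object>")
print(parse_grounded_boxes(sample))
```

Multiplying the normalized coordinates by the original image width and height yields pixel-space boxes.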
-
Congratulations to DeepSeek for the wonderful work. I wonder if there is a script for fine-tuning DeepSeek-VL? Thanks!
-
**Proceedings**
https://papers.nips.cc/book/advances-in-neural-information-processing-systems-30-2017
https://github.com/catpanda/NIPS_2017
**PaperLists (#Papers 679)**
https://www.dropbox.com/s…