-
When I run train_scanrefer.sh, the results in the output log are extremely bad, and I don't know why.
last_ position alignment Acc0.25: Top-1: 0.00032, Top-5: 0.00294, Top-10: 0.00936
last_ posit…
-
Hi, I think your work is very meaningful to me, but I encountered some issues while trying to replicate it. Are you using the pre-trained weights from https://huggingface.co/CH3COOK/LL3DA-weight-releas…
-
Dear authors,
I am wondering why the paper says that Vote2Cap is tested on ScanRefer rather than on the Scan2Cap benchmark.
As far as I understand, ScanRefer takes point clouds with a text query as inputs and …
-
We augment [ScanRefer](https://daveredrum.github.io/ScanRefer/) to create a dataset with 3 types of description-scene pairs: a) Zero Target; b) Single Target; and c) Multiple Targets, indicating zero,…
-
The two popular datasets [ScanRefer](https://daveredrum.github.io/ScanRefer/) and [ReferIt3D](https://referit3d.github.io/) connect natural language to real-world 3D data. In this paper, we curate a l…
-
Hi Dave,
The data browser website seems to be down.
Best,
Runsen
-
Thanks for sharing the work. I notice that the model can output the coordinates of 3D bounding boxes as numerical values. How can I access this data for 3D grounding tasks?
-
I ran the training code with the command provided in the README, but the results are lower than those in the paper.
| Accuracy | SAT | Reproduce |
| :-------: | :--: | :-------: |
| Nr3d | 49.2 | 46.…
-
How do I organize the Nr3D dataset in the same format as ScanRefer? Could you release the script for organizing the Nr3D dataset?
-
We propose SceneVerse, the first million-scale 3D vision-language dataset with 68K 3D indoor scenes and 2.5M vision-language pairs. SceneVerse contains 3D scenes curated from diverse existing datasets…