lizhou-cs / JointNLT

The official implementation for the CVPR 2023 paper Joint Visual Grounding and Tracking with Natural Language Specification.
MIT License

Regarding the issue of only providing language and img for testing #19

Open mengmimi opened 11 months ago

mengmimi commented 11 months ago

Hello, is it possible to supply only the language description, without the bounding box of the first frame, during the testing phase? For example, I would like to pass in a video and its language description without providing any ground truth at evaluation time. I tried all three settings of TEST_METHOD: "TRACK" # choice in ['GROUND', 'TRACK', 'JOINT'], but none of them worked. Each one still requires the ground-truth file: Exception: Could not read file D:/JointNLT-main-2/output//OTB_videos/Threecar/groundtruth.txt
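
To make the request concrete, here is a minimal sketch of the behaviour I am asking for: a loader that treats groundtruth.txt as optional instead of raising when it is missing. This is not code from JointNLT. The names `load_groundtruth` and `init_mode`, and the "box_init" / "language_init" labels, are hypothetical and only illustrate the idea. I am also assuming an OTB-style file with one `x, y, w, h` box per line.

```python
import os
from typing import List, Optional


def load_groundtruth(path: str) -> Optional[List[List[float]]]:
    """Return per-frame boxes, or None when no annotation file exists.

    Hypothetical helper, not part of the JointNLT code base.
    """
    if not os.path.isfile(path):
        # No annotations available: signal this instead of raising.
        return None
    boxes = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Assumed OTB-style rows: values separated by commas, tabs, or spaces.
            parts = line.replace(",", " ").split()
            boxes.append([float(p) for p in parts])
    return boxes


def init_mode(gt: Optional[List[List[float]]]) -> str:
    """Choose how the first frame would be initialised (hypothetical labels)."""
    return "box_init" if gt else "language_init"
```

With something like this, a sequence folder that only contains the frames and the language description would fall through to "language_init", and evaluation metrics that need ground truth could simply be skipped for that sequence.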