Testing with only the language description and images as input

Hello, is it possible to provide only the language description, without the bounding box of the first frame, during the testing phase? For example, I would like to input a video and its language description and run the evaluation without providing any ground-truth values.

I tried the three options for the test method:

TEST_METHOD: "TRACK" # choice in ['GROUND', 'TRACK', 'JOINT']

But none of them worked. The code still requires me to provide the ground-truth file:

Exception: Could not read file D:/JointNLT-main-2/output//OTB_videos/Threecar/groundtruth.txt

lizhou-cs / JointNLT, issue #19