Hello, I would like to ask whether it is possible to provide only the language description, without the bounding box of the first frame, during the testing phase. For example, I would like to input a video and its language description without providing any ground-truth values when evaluating.
I tried the three methods: TEST_METHOD: "TRACK" # choice in ['GROUND', 'TRACK', 'JOINT']
But it didn't work. It still wants me to provide the groundtruth file:
Exception: Could not read file D:/JointNLT-main-2/output//OTB_videos/Threecar/groundtruth.txt