Open · honghd16 opened this issue 2 months ago
The OneDrive link contains pre-built files with the detection results. They can be used as a cache to speed up the detection process.
Thanks for the quick response. I have downloaded and used real_time_detection.db. But since I am training with CLIP features, I suspect the agent visits additional viewpoints that are not pre-recorded and therefore require real-time detection. Is there a way to speed up owlvit-large detection? Also, how long did your first training run take?
Honestly, I don't quite remember how long the first training run took. I pre-detected all the viewpoints on the ground-truth paths and then added a real-time detection mechanism. Training this released repo with the pre-built files takes only one or two days on two GPUs. Evaluation takes less than half an hour. Does training take extra time when you use this repo? Maybe you could try smaller detectors or more GPUs, or temporarily remove the real-time detection mechanism in evaluation until your model reaches higher performance.
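The "pre-detect, then fall back to real-time detection" idea described above can be sketched roughly as follows. This is a minimal illustration and not the repo's implementation: the `DetectionCache` class, the table schema, and the `detector` callable are all hypothetical, and the actual layout of real_time_detection.db may differ.

```python
import json
import sqlite3
from typing import Callable, Dict, List

# Hypothetical detection record, e.g. {"label": "chair", "score": 0.9, "box": [0, 0, 10, 10]}
Detection = Dict[str, object]


class DetectionCache:
    """Look up detections in a SQLite cache; run the detector only on a miss."""

    def __init__(self, db_path: str,
                 detector: Callable[[str, str], List[Detection]]):
        self.conn = sqlite3.connect(db_path)
        # Assumed schema: one JSON-encoded result per (scan, viewpoint) pair.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS detections ("
            "scan TEXT, viewpoint TEXT, result TEXT, "
            "PRIMARY KEY (scan, viewpoint))"
        )
        self.detector = detector
        self.hits = 0
        self.misses = 0

    def get(self, scan: str, viewpoint: str) -> List[Detection]:
        row = self.conn.execute(
            "SELECT result FROM detections WHERE scan = ? AND viewpoint = ?",
            (scan, viewpoint),
        ).fetchone()
        if row is not None:
            # Cache hit: no detector call needed.
            self.hits += 1
            return json.loads(row[0])
        # Cache miss: run the (slow) detector once and store the result.
        self.misses += 1
        result = self.detector(scan, viewpoint)
        self.conn.execute(
            "INSERT OR REPLACE INTO detections VALUES (?, ?, ?)",
            (scan, viewpoint, json.dumps(result)),
        )
        self.conn.commit()
        return result
```

Tracking `hits` and `misses` in a wrapper like this is one way to check how many visited viewpoints fall outside the pre-built cache, which indicates whether real-time detection is really the bottleneck.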
Thank you for your advice! I haven't run the original repo, since I need to use CLIP features for a fair comparison with other methods. I am considering smaller detectors, but I am not sure whether they would significantly affect performance. Do you have any experience with using different detectors for OVER-NAV?
I did not run the repo with other detectors since the training time was acceptable to me. But I did draw the detection boxes on the images and they seem fine.
Okay, thank you so much for the help. I will try your suggestions.
Hi, thank you for sharing this excellent work!
I am currently running OVER-NAV-IVLN on my server, but I've run into an issue with the detection process for structured_memory. The process has been running for an entire day and still hasn't completed a single round of evaluation. Is this extended runtime expected, and are there any recommended ways to speed up this step?
Any help would be appreciated. Thanks!