planb788 closed this issue 1 month ago
Hi @planb788, thank you for your interest in TAPTR and for kindly supplementing the environment requirements. I've been busy with other work recently and cannot fix these issues right away, but I have an idea for reducing memory usage that you can try yourself:
In the forward function of TAPTR (https://github.com/IDEA-Research/TAPTR/blob/503a3339eb560408dca753241750561b9c3b8bd0/models/dino/taptr.py#L353), we process all of the images up front and store the resulting feature maps in GPU memory. You could change this so that the feature map of a frame is computed only when points are being tracked on that frame, and is deleted once it has been used.
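A minimal sketch of that idea, for illustration only. This is not TAPTR's actual code: `iter_features`, `track_lazily`, `backbone`, and `update` are hypothetical names, and the real forward function is organized differently. It only shows the pattern of computing one frame's features, using them, and dropping the reference before moving on to the next frame.

```python
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

Frame = TypeVar("Frame")
Feat = TypeVar("Feat")
State = TypeVar("State")


def iter_features(
    frames: Iterable[Frame], backbone: Callable[[Frame], Feat]
) -> Iterator[Tuple[int, Feat]]:
    """Yield (frame index, feature map) lazily, one frame at a time.

    `backbone` is a hypothetical stand-in for the feature extractor.
    Nothing is cached here, so no full list of feature maps is built.
    """
    for t, frame in enumerate(frames):
        yield t, backbone(frame)


def track_lazily(
    frames: Iterable[Frame],
    backbone: Callable[[Frame], Feat],
    update: Callable[[State, int, Feat], State],
    state: State,
) -> State:
    """Update the tracking state frame by frame.

    `update` is a hypothetical per-frame tracking step. Each feature map
    is released right after it is used, so at most one is referenced
    from this loop at any time.
    """
    for t, feat in iter_features(frames, backbone):
        state = update(state, t, feat)
        del feat  # drop the reference before the next frame is processed
    return state
```

The trade-off is that a feature map has to be recomputed if the same frame is visited again (for example across several passes over the video), so this pattern exchanges extra compute for lower peak memory.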
Hi, @planb788, I happened to have the same need this morning. I added a memory-efficient mode to TAPTR. By simply setting this flag to true, the memory requirement can be reduced to only 8GB. https://github.com/IDEA-Research/TAPTR/blob/0d3902db006ecc32f74cc42834af23d005068c07/models/dino/taptr.py#L116
I will close this issue. If you have any questions, feel free to reopen it.
Hi, @LHY-HongyangLi. Thank you for your outstanding work!
I’m curious: does memory_efficient_mode impact the accuracy of your model? I noticed that the evaluation result on tapvid_davis_first with memory_efficient_mode is slightly lower than co-tracker.
I look forward to your response!
It seems that it does indeed hurt the performance, but I have not figured out the reason yet. 🤕
In actual deployment and use, GPU memory usage easily exceeds 20 GB, even though the video is only two to three seconds long.