time required for processing the video in different scenarios

NVlabs / FoundationPose

[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

https://nvlabs.github.io/FoundationPose/

time required for processing the video in different scenarios #64

Closed monajalal closed 1 month ago

monajalal commented 1 month ago

I would like to know approximately how long it takes you to process a video of similar length, and whether there is a way to make it faster.

Input: a 2.8-minute video

Took 6.42 minutes with pose estimation on the first frame only

Took 7.28 minutes with pose estimation on the first frame only and track_refine_iter set to 5

Capture is done with Intel RealSense D435 camera.

wenbowen123 commented 1 month ago

Did you include the IO time? Did you disable all debug logging? How complex is your mesh? All of these can affect the running time. You should also consider the NVIDIA SDK, which runs much faster.
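One way to check whether IO time is included is to time loading and processing separately. The following is a minimal sketch, not part of FoundationPose: `time_stages`, `load_frame`, and `process_frame` are hypothetical names standing in for whatever reads a frame from disk and whatever runs tracking on it.

```python
import time


def time_stages(frames, load_frame, process_frame):
    """Accumulate IO time and compute time separately over a sequence.

    `load_frame` and `process_frame` are hypothetical callables supplied by
    the caller; they are not FoundationPose functions.
    """
    io_s = 0.0
    compute_s = 0.0
    count = 0
    for item in frames:
        t0 = time.perf_counter()
        data = load_frame(item)      # disk read / decoding (IO)
        t1 = time.perf_counter()
        process_frame(data)          # pose tracking (compute)
        t2 = time.perf_counter()
        io_s += t1 - t0
        compute_s += t2 - t1
        count += 1
    return {"io_s": io_s, "compute_s": compute_s, "frames": count}


if __name__ == "__main__":
    # Stand-in workload so the sketch runs on its own.
    stats = time_stages(range(5), lambda i: [i] * 1000, lambda d: sum(d))
    print(stats)
```

If `process_frame` launches GPU work asynchronously, the compute figure is only meaningful when the device is synchronized before each timestamp is read (for PyTorch code, `torch.cuda.synchronize()`).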

monajalal commented 1 month ago

What do you mean by "NVIDIA SDK"? Do you mean the one discussed in https://github.com/NVlabs/FoundationPose/issues/56#issuecomment-2053732961? It has not been released yet.

Thanks for all the pointers. Yes, I should turn off the debug output. I will report back later.

monajalal commented 1 month ago

Thanks a lot for your input.

After removing --debug 3:

Input: a 2.8-minute video

Took 2.1 minutes with pose estimation on the first frame only

Took 3.14 minutes with pose estimation on the first frame only and track_refine_iter set to 5

I started timing when the visualization appears on the screen. I am very happy with the processing time. Thanks.
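To compare the runs above, it can help to express each processing time as a multiple of the video length. This is a small sketch using the numbers reported in this thread; it assumes the figures are decimal minutes, and `realtime_factor` is an illustrative helper name.

```python
def realtime_factor(processing_min, video_min):
    """Return processing time as a multiple of the video length.

    A value below 1.0 means processing finished faster than playback.
    """
    return processing_min / video_min


VIDEO_MIN = 2.8  # length of the input video, in minutes (from the thread)

# Processing times reported in the thread, assumed to be decimal minutes.
runs = {
    "debug on, first-frame estimation": 6.42,
    "debug on, track_refine_iter=5": 7.28,
    "debug off, first-frame estimation": 2.1,
    "debug off, track_refine_iter=5": 3.14,
}

for label, minutes in runs.items():
    print(f"{label}: {realtime_factor(minutes, VIDEO_MIN):.2f}x video length")
```

With these numbers, the debug runs come out at roughly 2.29x and 2.60x the video length, and the runs without debug output at roughly 0.75x and 1.12x.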