google-deepmind / tapnet

Tracking Any Point (TAP)
https://deepmind-tapir.github.io/blogpost.html
Apache License 2.0
1.17k stars · 115 forks

Visualization #62

Open xxiMiaxx opened 9 months ago

xxiMiaxx commented 9 months ago

Dear Authors, thank you for this brilliant work.

I noticed that on the TAPIR website you have the following visualizations:

[image: motion-tail visualization from the TAPIR website]

You mentioned that you obtained them by segmenting the scene into foreground and background, then removing the background points to reveal how the objects move through the scene.

I was wondering: is the script/function that produces this available in the repo?

I just think that this particular visualization would make the point more strongly on my videos.

Thank you.

best, Mia

cdoersch commented 9 months ago

Unfortunately the code for that visualization is something I hacked together very fast, and it isn't in a state where it can be released easily. The basic steps are:

1) Compute dense trajectories by sampling query points on a grid from one (or more) frames, typically from somewhere in the middle of the video.
2) Choose a reference frame, and use RANSAC to compute a homography between the reference frame and every other frame, explaining as many points as possible that are visible in both frames (points that are not visible in both frames are ignored).
3) Remove points that are inliers with respect to a large fraction of the homographies.
4) For everything else, transform the 'tails' on every frame using the homographies, so that points which exactly follow the camera will all be plotted at the same location (although in practice, these will be 'inliers' and should get removed in the prior step).
5) `plt.gca().add_collection(matplotlib.collections.LineCollection(pts, color=color_alpha))` is what we used for plotting the lines, as it lets you control both color and opacity for every line segment. I haven't found another Python solution that can plot points fast enough to be practical.
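A rough sketch of these steps in plain NumPy/matplotlib (illustrative only: the function names, thresholds, and the hand-rolled RANSAC are my assumptions, not the code the authors used):

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection


def fit_homography(src, dst):
    """Direct linear transform: homography from 4+ point correspondences."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)


def project(H, pts):
    """Apply a homography to an (N, 2) array of points."""
    ph = np.concatenate([pts, np.ones((len(pts), 1))], axis=1) @ H.T
    return ph[:, :2] / ph[:, 2:3]


def ransac_homography(src, dst, iters=200, thresh=5.0, seed=0):
    """Step 2: robustly fit src -> dst, ignoring independently moving points."""
    rng = np.random.default_rng(seed)
    best_h, best_count = np.eye(3), -1
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        count = np.sum(np.linalg.norm(project(H, src) - dst, axis=1) < thresh)
        if count > best_count:
            best_count, best_h = count, H
    return best_h


def foreground_mask(tracks, visible, ref=0, thresh=5.0, inlier_frac=0.8):
    """Steps 2-3: points that are homography inliers on most frames are
    background; keep the rest. tracks is (num_points, num_frames, 2)."""
    num_points, num_frames, _ = tracks.shape
    homographies = []
    inlier = np.zeros((num_points, num_frames), dtype=bool)
    for t in range(num_frames):
        both = visible[:, ref] & visible[:, t]
        H = ransac_homography(tracks[both, ref], tracks[both, t], thresh=thresh)
        homographies.append(H)
        err = np.linalg.norm(project(H, tracks[:, ref]) - tracks[:, t], axis=1)
        inlier[:, t] = (err < thresh) & both
    keep = inlier.mean(axis=1) < inlier_frac
    return keep, homographies


def draw_tails(ax, tracks, homographies, t, tail=8):
    """Steps 4-5: warp each point's recent past into frame t's coordinates
    so camera motion cancels, then draw fading tails with a LineCollection."""
    segments, colors, prev = [], [], None
    for s in range(max(0, t - tail), t + 1):
        M = homographies[t] @ np.linalg.inv(homographies[s])
        cur = project(M, tracks[:, s])
        if prev is not None:
            segments.extend(np.stack([prev, cur], axis=1))
            alpha = (s - (t - tail)) / tail  # older segments fade out
            colors.extend([(1.0, 0.2, 0.2, max(alpha, 0.0))] * len(cur))
        prev = cur
    ax.add_collection(LineCollection(segments, colors=colors))
```

The key design point is the one the author mentions: per-segment RGBA colors in a single `LineCollection` are far faster than a separate `plot` call per tail.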

Obviously this can be improved by adding a refinement step that optimizes all of the homographies together; it would get rid of the jitter that you see in many of these videos (I suspect other labs do this with their visualizations).
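One way to sketch that joint refinement (my assumption of what such a step could look like, not the authors' method): parametrize each homography by its first eight entries with `H[2, 2]` fixed to 1, then jointly minimize background reprojection error across all frames plus a temporal smoothness penalty on consecutive frames' parameters, which directly targets the frame-to-frame jitter mentioned above.

```python
import numpy as np
from scipy.optimize import least_squares


def refine_homographies(homographies, tracks, inlier, ref=0, smooth=0.1):
    """Jointly refine all per-frame homographies (a sketch, not released
    code). The cost is the reprojection error of background (inlier)
    points plus a smoothness term that damps parameter jitter between
    consecutive frames."""
    num_frames = len(homographies)
    num_points = tracks.shape[0]
    ref_h = np.concatenate([tracks[:, ref], np.ones((num_points, 1))], axis=1)
    # Initialize from the per-frame RANSAC estimates, normalized so H[2,2]=1.
    x0 = np.concatenate([(H / H[2, 2]).ravel()[:8] for H in homographies])

    def residuals(x):
        res = []
        for t in range(num_frames):
            H = np.append(x[8 * t:8 * t + 8], 1.0).reshape(3, 3)
            m = inlier[:, t]
            proj = ref_h[m] @ H.T
            res.append((proj[:, :2] / proj[:, 2:3] - tracks[m, t]).ravel())
        params = x.reshape(num_frames, 8)
        res.append(smooth * np.diff(params, axis=0).ravel())
        return np.concatenate(res)

    sol = least_squares(residuals, x0)
    return [np.append(sol.x[8 * t:8 * t + 8], 1.0).reshape(3, 3)
            for t in range(num_frames)]
```

The smoothness weight trades off fidelity to the tracks against temporal stability; a more faithful version would also re-estimate the inlier set between optimization rounds.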

cdoersch commented 8 months ago

This code is now released! See https://colab.research.google.com/github/deepmind/tapnet/blob/master/colabs/tapir_rainbow_demo.ipynb and let me know if it works for you.