Add points in the middle frames in OnlineTAPIR

PinxueGuo commented 2 months ago

Thank you for the nice work!

May I ask whether OnlineTAPIR (torch_causal_tapir_demo.ipynb) can add points at intermediate frames rather than only at frame 0, as StandardTAPIR (torch_tapir_demo.ipynb) can? I am not sure how to construct causal_state for this case.

yangyi02 commented 2 months ago

For tracking points from another query frame, you can extract the query features from your query frame, then run the OnlineTAPIR inference pipeline.

i.e. query_features = online_model_init(frames[None, query_frame_index:query_frame_index + 1], query_points[None])

OnlineTAPIR is supposed to run autoregressively from the beginning of the video, so there is no need to change the causal_state initialization.
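A minimal sketch of the flow described above, written against placeholder callables so it runs without the model. Here `init_fn` and `predict_fn` are hypothetical stand-ins for the notebook's `online_model_init` and `online_model_predict`; their argument order and return values are assumptions for illustration, not the notebook's actual signatures, and `track_from_query_frame` is a name made up for this example.

```python
from typing import Any, Callable, List, Sequence, Tuple


def track_from_query_frame(
    frames: Sequence[Any],
    query_frame_index: int,
    query_points: Any,
    init_fn: Callable[[Any, Any], Any],
    predict_fn: Callable[[Any, Any, Any], Tuple[Any, Any]],
    initial_state: Any,
) -> List[Any]:
    """Extract query features at one frame, then run causally from frame 0.

    init_fn and predict_fn are hypothetical placeholders (assumed signatures).
    """
    # Query features are computed once, from the query frame only.
    query_features = init_fn(frames[query_frame_index], query_points)

    # The causal state is initialized as usual and is not tied to the query frame.
    state = initial_state
    predictions = []

    # Inference still starts at the first frame and proceeds one frame at a time.
    for frame in frames:
        prediction, state = predict_fn(frame, query_features, state)
        predictions.append(prediction)
    return predictions
```

With the real model, the two callables would wrap the feature extraction and the per-frame prediction step from the notebook, and `initial_state` would be whatever causal state the notebook already constructs.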

PinxueGuo commented 2 months ago

Thank you for your reply.

Do you mean that the presence of new query points in subsequent frames should be determined during the causal state initialization at the beginning of the video? However, in my setting, it’s impossible to determine at the outset whether new query points will be added in later frames.

For instance, I might select 3 query points in the first frame, but by the 10th frame, I might want to add 2 more query points starting from that frame. So at the 11th frame there should be 5 tracking points, 3 of which start at frame 1, and 2 of which start at frame 10.
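To make the scenario concrete, here is a small bookkeeping sketch that is independent of the TAPIR API. The helper `active_points` and the list `query_frames` are hypothetical names used only for illustration, and frame numbers are 1-based to match the description above.

```python
from typing import List


def active_points(query_frames: List[int], frame: int) -> List[int]:
    """Return indices of points whose query frame is at or before `frame`."""
    return [i for i, q in enumerate(query_frames) if q <= frame]


# Three points selected at frame 1, two more added at frame 10.
query_frames = [1, 1, 1, 10, 10]

print(len(active_points(query_frames, 9)))   # 3
print(len(active_points(query_frames, 11)))  # 5
```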

Does this mean that to add 2 new points at the 10th frame, I need to discard the query features from the 1st frame and the history accumulated over frames 1-9, and instead re-extract query features from the 3 predicted points at the 10th frame (which could introduce errors)? Additionally, since the number of points increases from 3 to 5, would the shape of the causal state also need to change, by reinitializing it for 5 points?

google-deepmind / tapnet

Add points in the middle frames in OnlineTAPIR #114