How to encode the time (frame id) information?

sallymmx commented 2 years ago

How do you set the time information when constructing the Nx4 point cloud, with time as the final dimension, as described in your paper? There seem to be several choices: (1) the true frame id, e.g. 400 for the 400th frame; (2) a normalized frame id in the range 0-1, so that 0.5 indicates the middle frame (the 200th frame of a 400-frame video); (3) simply 0 for all template frames and 1 for all search frames.

Ghostish commented 2 years ago

Interesting question! Due to limited space, we omitted the details of the temporal encoding from the main paper. Here are the key points that guided our design: 1) Since we focus on online tracking, where the total number of frames is not known in advance, a normalized frame id (choice 2) is not applicable. 2) MM-Track only considers the relative target motion between two consecutive frames, so we do not need the true frame id to encode time. A relative time encoding (choice 3) is more suitable for MM-Track. 3) In our implementation, we use 0.0/0.1 instead of 0/1 to encode the relative timestamps.
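For readers who want to see the relative encoding concretely, here is a minimal NumPy sketch, not the code from the repository. It assumes each frame is an (N, 3) array of xyz coordinates and that 0.0 marks the previous frame and 0.1 the current frame, as stated above. The function and constant names are hypothetical.

```python
import numpy as np

# Values taken from the reply above: 0.0 for the previous frame,
# 0.1 for the current frame. The constant names are hypothetical.
PREV_TIMESTAMP = 0.0
CURR_TIMESTAMP = 0.1


def append_timestamp(points: np.ndarray, timestamp: float) -> np.ndarray:
    """Return an (N, 4) array: the xyz coordinates plus a constant time channel."""
    time_channel = np.full((points.shape[0], 1), timestamp, dtype=points.dtype)
    return np.concatenate([points, time_channel], axis=1)


def build_spatiotemporal_cloud(prev_points: np.ndarray,
                               curr_points: np.ndarray) -> np.ndarray:
    """Stack two consecutive frames into one (N_prev + N_curr, 4) point cloud."""
    prev_xyzt = append_timestamp(prev_points, PREV_TIMESTAMP)
    curr_xyzt = append_timestamp(curr_points, CURR_TIMESTAMP)
    return np.concatenate([prev_xyzt, curr_xyzt], axis=0)
```

Because the time channel only distinguishes "previous" from "current", the same two values can be reused for every pair of frames, however long the sequence is.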

sallymmx commented 2 years ago

Thanks for your reply. You wrote, "we use 0.0/0.1 instead of 0/1 to encode the relative timestamps." But if the first template is used, is the temporal value for the template still set to 0 for all of its points?

Ghostish commented 2 years ago

Hi, I am not sure what you mean by "template". MM-Track is a motion-centric method. Throughout the pipeline, there are no "templates" or "search areas"; we only have the previous frame and the current frame.

sallymmx commented 2 years ago

So you don't use the first frame in MM-Track?

Ghostish commented 2 years ago

No. To be clear, we only use the first frame to start tracking during inference (i.e., the first frame acts as the first "previous frame"). We do not reuse the first frame in subsequent tracking.
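The inference loop described here can be sketched roughly as follows. This is an illustration only, not the repository's code. `track_sequence` and `estimate_box` are hypothetical names, and passing the previous box to the estimator is an assumption about how the target is carried between frames.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")
Box = TypeVar("Box")


def track_sequence(
    frames: Sequence[Frame],
    first_box: Box,
    estimate_box: Callable[[Frame, Frame, Box], Box],
) -> List[Box]:
    """Track a target online, pairing each frame only with its predecessor.

    `estimate_box` is a hypothetical stand-in for the motion model: it takes
    (previous frame, current frame, previous box) and returns the current box.
    """
    if not frames:
        return []
    boxes = [first_box]
    # The first frame only seeds the loop as the first "previous frame".
    prev_frame, prev_box = frames[0], first_box
    for curr_frame in frames[1:]:
        curr_box = estimate_box(prev_frame, curr_frame, prev_box)
        boxes.append(curr_box)
        # The current frame becomes the next "previous frame";
        # frames[0] is never revisited after the first step.
        prev_frame, prev_box = curr_frame, curr_box
    return boxes
```

Each step sees exactly two consecutive frames, which is why a fixed pair of relative timestamps is enough and no absolute frame id is needed.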

Ghostish / Open3DSOT

How to encode the time (frame id) information? #25