realgump / MvMHAT

MvMHAT: Self-supervised Multi-view Multi-Human Association and Tracking (ACM MM 2021, Oral Paper)

Intuition of order of association #19

Closed lunaryle closed 1 month ago

lunaryle commented 1 month ago

Hi @realgump, thank you for your great work on MvMHAT. When you extend single-view trackers to multi-view tracking, you apply spatial association first (in frame_matching() and deep_sort/linear_assignment.py). What is the intuition behind applying spatial association first? Would this setting also be appropriate for non-overlapping cameras? I also wonder whether you have run an ablation study on the order of association.

realgump commented 1 month ago

In my opinion, the tracking problem is typically an online task. For time-synchronized cameras, at time 0, there is only one frame from each camera. Therefore, we must apply spatial association and assign an ID to each person to initialize their tracklet. At time t > 0, as described in Algorithm 1, Lines 6-9 of Self-supervised Multi-view Multi-Human Association and Tracking, we first apply temporal association, followed by spatial association.

Any person left unmatched during spatial association is assigned a new ID, so the algorithm is suitable for non-overlapping camera settings as well.
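The order described above can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the repository's implementation: appearance features are plain floats, matching is greedy rather than the assignment used in deep_sort/linear_assignment.py, and `ToyTracker`, `greedy_match`, and `dist` are hypothetical names introduced only for this example.

```python
import itertools


def dist(a, b):
    """Toy appearance distance between two scalar features (assumption)."""
    return abs(a - b)


def greedy_match(cost, threshold):
    """Greedily pair row/column indices, cheapest first, up to threshold."""
    pairs = sorted(
        (cost[r][c], r, c)
        for r in range(len(cost))
        for c in range(len(cost[r]))
    )
    used_rows, used_cols, matches = set(), set(), {}
    for value, r, c in pairs:
        if value <= threshold and r not in used_rows and c not in used_cols:
            matches[r] = c
            used_rows.add(r)
            used_cols.add(c)
    return matches


class ToyTracker:
    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.next_id = itertools.count(1)
        self.tracks = {}  # camera -> {person_id: last seen feature}

    def step(self, detections):
        """detections: {camera: [feature, ...]} for one synchronized time step."""
        ids = {cam: [None] * len(feats) for cam, feats in detections.items()}

        # 1) Temporal association: match against the same camera's tracks.
        #    At time 0 there are no tracks, so nothing is matched here.
        for cam, feats in detections.items():
            prev = list(self.tracks.get(cam, {}).items())
            cost = [[dist(f, pf) for _, pf in prev] for f in feats]
            for det, trk in greedy_match(cost, self.threshold).items():
                ids[cam][det] = prev[trk][0]

        for cam, feats in detections.items():
            # 2) Spatial association: unmatched detections may inherit an ID
            #    from an already identified detection in another camera.
            for other, other_feats in detections.items():
                if other == cam:
                    continue
                unmatched = [i for i, x in enumerate(ids[cam]) if x is None]
                taken = set(ids[cam])
                cands = [j for j, x in enumerate(ids[other])
                         if x is not None and x not in taken]
                cost = [[dist(feats[i], other_feats[j]) for j in cands]
                        for i in unmatched]
                for u, c in greedy_match(cost, self.threshold).items():
                    ids[cam][unmatched[u]] = ids[other][cands[c]]
            # 3) Anything still unmatched starts a new track with a new ID.
            for i, x in enumerate(ids[cam]):
                if x is None:
                    ids[cam][i] = next(self.next_id)

        # Keep only the current frame's tracks (no max_age handling here).
        for cam, feats in detections.items():
            self.tracks[cam] = dict(zip(ids[cam], feats))
        return ids


tracker = ToyTracker()
first = tracker.step({"A": [0.0, 5.0], "B": [0.1, 9.0]})
second = tracker.step({"A": [0.2, 5.1], "B": [0.3, 9.1]})
third = tracker.step({"A": [0.2], "B": [0.3, 20.0]})
print(first, second, third)
```

In this toy run, the first step can only use spatial association, the second step is resolved entirely by temporal association, and the person who appears only in camera "B" at the third step receives a fresh ID, which is the behavior that matters for non-overlapping views.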

lunaryle commented 1 month ago

@realgump, thank you for your detailed explanation. It fixed the first frame mis-matching problem.

One more thing I am uncertain about is the re-matching step (RENEW_TIME in the config) in frame_callback() at deep_sort/update.py.

  def frame_callback(self, frame_idx):
      if C.RENEW_TIME:
          re_matching = frame_idx % C.RENEW_TIME == 0
      else:
          re_matching = 0
      # ... continued

From your code, the re-matching step resets the matched tracks to an empty list, but I could not grasp the implications. How does RENEW_TIME differ from max_age in typical single-view trackers?
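To make the schedule concrete, here is a minimal sketch of the flag logic from the excerpt above. The helper name `re_matching_flag` and the value 30 are hypothetical, chosen only for illustration; the real RENEW_TIME comes from the repository's config.

```python
def re_matching_flag(frame_idx, renew_time):
    """Mirror the flag computed at the top of the frame_callback() excerpt."""
    if renew_time:
        return frame_idx % renew_time == 0
    return False


# Hypothetical setting, for illustration only.
renew_time = 30
trigger_frames = [i for i in range(100) if re_matching_flag(i, renew_time)]
print(trigger_frames)  # [0, 30, 60, 90]
```

So the flag fires on a fixed period that is independent of any individual track, and it never fires when RENEW_TIME is 0 or unset.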