zlai0 / MAST

MAST: A Memory-Augmented Self-supervised Tracker (CVPR 2020)
https://zlai0.github.io/MAST/

Some details in the paper #1

Closed · irfanICMLL closed 4 years ago

irfanICMLL commented 4 years ago

Thank you for sharing such great work! I have some questions about the implementation details of this paper.

  1. What is the 'soft and hard propagation' in Table 6? Does it refer to the propagation function used during inference? I am a little confused about the details of this table.
  2. Can you share the function you use to achieve image-feature alignment? Can I use an odd input size and `align_corners=True` to achieve the feature alignment?
  3. What is the result when the memory-augmented tracker is removed?

I would be very grateful if you could help me with these details. Looking forward to the upcoming code.

zlai0 commented 4 years ago
  1. Yes, that's right. "Hard" just means you apply an argmax to quantize the results, while "Soft" keeps the raw label distribution (see the loop sketch after this list). Specifically,

        import torch
        import torch.nn.functional as F

        _output = model(rgb_0, anno_0, rgb_1, ref_index, i + 1)
        _output = F.interpolate(_output, (h, w), mode='bilinear')

        # Hard: quantize the label distribution to one label per pixel
        output = torch.argmax(_output, 1, keepdim=True).float()

        # Soft: keep the raw label distribution
        output = _output

  2. Just do `x = image.float()[:, :, ::4, ::4]`, because the centers of the CNN filters start from the top-left corner (see the alignment sketch below).

  3. About 59 (J&F mean). Refer to Table 5 of the paper.
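
For 1, here is a minimal, self-contained sketch of where the hard/soft choice sits in a frame-by-frame propagation loop. The stand-in `model`, the tensor shapes, and the `hard` flag are illustrative assumptions, not the released code:

    import torch
    import torch.nn.functional as F

    B, C, H, W = 1, 7, 480, 912                  # batch, label classes, frame size (assumed)
    h, w = H, W
    rgbs = [torch.randn(B, 3, H, W) for _ in range(5)]
    anno_0 = torch.randn(B, C, H // 4, W // 4)   # first-frame annotation at feature stride 4
    rgb_0, ref_index = rgbs[0], [0]
    hard = True

    def model(rgb_0, anno_0, rgb_t, ref_index, t):
        # placeholder for the MAST network: per-pixel label logits at stride 4
        return torch.randn(B, C, H // 4, W // 4)

    for i in range(1, len(rgbs)):
        _output = model(rgb_0, anno_0, rgbs[i], ref_index, i + 1)
        _output = F.interpolate(_output, (h, w), mode='bilinear')
        if hard:
            # hard propagation: commit to one label per pixel
            output = torch.argmax(_output, 1, keepdim=True).float()
        else:
            # soft propagation: carry the full label distribution forward
            output = _output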
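
And for 2, a small sketch of why the strided slice lines up with a stride-4 feature grid; the input shape here is an assumption:

    import torch

    image = torch.randn(1, 3, 480, 912)      # (B, C, H, W), shape assumed
    stride = 4                                # assumed CNN feature stride
    feat_h, feat_w = image.shape[2] // stride, image.shape[3] // stride

    # Take every 4th pixel starting at the top-left corner: this samples the
    # pixel under each filter center, so colors align with the feature grid.
    x = image.float()[:, :, ::stride, ::stride]
    assert x.shape[2:] == (feat_h, feat_w)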