megvii-research / MOTR

[ECCV2022] MOTR: End-to-End Multiple-Object Tracking with TRansformer

Demo with init boxes #21

Closed gurkirt closed 2 years ago

gurkirt commented 3 years ago

Hello, thank you for this amazing work.

Is it possible for your method to accept bounding box initialisation?

Z-Yh-June commented 3 years ago

Hello!

When I run the demo, I get this error:

No such file or directory: 'exps/e2e_motr_r50_joint/motr_final.pth'

So where is motr_final.pth, and how can I download it?

Please help me, thank you.

vaesl commented 2 years ago

> Hello, thank you for this amazing work.
>
> Is it possible for your method to accept bounding box initialisation?

Good question! I think it is possible. Since each reference point in D-DETR is the center point of a bounding box, you can try modifying this line to replace the reference point with the center points calculated from the initialized bounding boxes.
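To make the suggestion concrete, here is a minimal sketch of the center-point computation. It assumes the init boxes are given in pixels as (x1, y1, x2, y2) and that reference points are expected as (cx, cy) normalized to [0, 1]; the function name `boxes_to_reference_points` is hypothetical and not part of the MOTR code.

```python
def boxes_to_reference_points(boxes_xyxy, img_w, img_h):
    """Convert pixel boxes (x1, y1, x2, y2) to normalized (cx, cy) centers."""
    points = []
    for x1, y1, x2, y2 in boxes_xyxy:
        cx = (x1 + x2) / 2.0 / img_w
        cy = (y1 + y2) / 2.0 / img_h
        # Clamp so that boxes partly outside the image still give valid points.
        points.append((min(max(cx, 0.0), 1.0), min(max(cy, 0.0), 1.0)))
    return points


# Hypothetical init boxes on a 640x480 frame.
init_boxes = [(100, 50, 300, 250), (0, 0, 64, 48)]
ref_points = boxes_to_reference_points(init_boxes, img_w=640, img_h=480)
```

In the real model these points would be a tensor on the same device as the queries; plain Python lists are used here only to keep the sketch self-contained.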

vaesl commented 2 years ago

> Hello!
>
> When I run the demo, I get this error:
>
> No such file or directory: 'exps/e2e_motr_r50_joint/motr_final.pth'
>
> So where is motr_final.pth, and how can I download it?
>
> Please help me, thank you.

Hi! Please download the trained model from the link given in the README, then put the pretrained weights into that directory.
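After downloading, a quick check like the one below can confirm the weights are where the demo looks for them. The path is the one from the error message; the helper name `checkpoint_ready` is hypothetical.

```python
from pathlib import Path


def checkpoint_ready(path="exps/e2e_motr_r50_joint/motr_final.pth"):
    """Return True if the checkpoint file exists at the expected location."""
    return Path(path).is_file()


if not checkpoint_ready():
    print("Checkpoint not found: download it via the README link and place it "
          "at exps/e2e_motr_r50_joint/motr_final.pth")
```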

TerranLord commented 2 years ago

MOTR is not designed for bbox initialization. If you really need it, I personally think you can follow the two-stage setting of Deformable-DETR, where bboxes output by the first stage are mapped into queries for the second stage (the transformer decoder). Specifically, if you would like to use init bboxes rather than the model's own detections, you can replace the learnable queries (the 300 detection queries used to detect newborn objects) with queries mapped from your init bboxes.

In this way, MOTR could take your init bboxes, rather than bboxes it detects itself, as newborn objects while keeping the tracking mechanism almost unchanged.
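As a rough illustration of mapping boxes to queries, the sketch below builds a sine/cosine embedding for one normalized (cx, cy, w, h) box, loosely modeled on how two-stage Deformable-DETR embeds its first-stage proposals. The function name, the default sizes, and the pure-Python form are assumptions for illustration. In the two-stage setting, an embedding like this is then passed through learned layers to produce the decoder queries; that learned part is omitted here.

```python
import math


def box_sine_embedding(box_cxcywh, num_pos_feats=128, temperature=10000.0):
    """Embed one normalized (cx, cy, w, h) box as 4 * num_pos_feats floats."""
    scale = 2.0 * math.pi
    embedding = []
    for coord in box_cxcywh:
        for i in range(num_pos_feats):
            # Each consecutive (sin, cos) pair shares one frequency.
            freq = temperature ** (2 * (i // 2) / num_pos_feats)
            angle = coord * scale / freq
            embedding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return embedding


# Hypothetical init boxes, already normalized to [0, 1].
init_boxes = [(0.5, 0.5, 0.2, 0.2), (0.1, 0.3, 0.05, 0.1)]
box_queries = [box_sine_embedding(b) for b in init_boxes]
```

Each init box yields one fixed-length vector, so the number of newborn-object queries would follow the number of init boxes instead of being fixed at 300.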

However, you would need to re-train MOTR to accept queries converted from init bboxes, and the label assignment strategy should be carefully adapted to match. For example, if a GT object is completely missing from your init bboxes, the one-to-one matching still requires a dt box to be assigned to that missed GT, which makes the network difficult to train.
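One possible way to spot this situation, sketched below, is to flag GT boxes that no init box overlaps well enough, so they can be handled separately before matching. This is an assumption about how the assignment might be adapted, not something MOTR implements; the helper names and the 0.5 IoU threshold are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def split_gt_by_coverage(gt_boxes, init_boxes, iou_thresh=0.5):
    """Return indices of GT boxes that are covered / missed by the init boxes."""
    covered, missed = [], []
    for idx, gt in enumerate(gt_boxes):
        best = max((iou(gt, b) for b in init_boxes), default=0.0)
        (covered if best >= iou_thresh else missed).append(idx)
    return covered, missed


# The second GT object has no nearby init box, so it is reported as missed.
gt_boxes = [(0, 0, 10, 10), (100, 100, 120, 120)]
init_boxes = [(1, 1, 10, 10)]
covered, missed = split_gt_by_coverage(gt_boxes, init_boxes)
```

Whether missed GTs should be dropped from the matching targets or treated some other way is a design decision that would need to be validated during re-training.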