durbin-164 closed this issue 5 months ago
Hi @durbin-164
Yes, we used off-the-shelf object detectors such as MVDeTr and YOLO in our experiments. Our ReST model is designed as a tracker, so it needs detections as input. You can use either ground-truth detections or detections from any other well-performing detector.
Hi @chengche6230
Thanks for your quick reply. So, I need to first make a JSON file with any object detection model and then run the ReST tracker, right?
Is it possible to track in real time, i.e. to detect and track frame by frame rather than processing the whole video at once?
Also, I tried to train the Wildtrack model. After training SG and TG separately, I got two models and used them for testing, but I could not achieve good results. Your pre-trained model, however, works perfectly on the same test. What could I have missed? Could you help me?
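With my trained model, I get evaluation results like the following, which show that tracking is not working at all:

| | IDF1 | IDP | IDR | Rcll | Prcn | GT | MT | PT | ML | FP | FN | IDs | FM | MOTA | MOTP | IDt | IDa | IDm |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.4% | 1.4% | 1.4% | 100.0% | 100.0% | 9 | 9 | 0 | 0 | 0 | 0 | 612 | 0 | 1.4% | 0.000 | 0 | 612 | 0 |
| 1 | 1.4% | 1.4% | 1.4% | 100.0% | 100.0% | 8 | 8 | 0 | 0 | 0 | 0 | 544 | 0 | 1.4% | 0.000 | 0 | 544 | 0 |
| OVERALL | 1.4% | 1.4% | 1.4% | 100.0% | 100.0% | 17 | 17 | 0 | 0 | 0 | 0 | 1156 | 0 | 1.4% | 0.000 | 0 | 1156 | 0 |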
Hi,
Yes, use the detections as input to the tracker. In our work, we focus on the data association part rather than on designing an end-to-end model. It would be great if you could follow up on our research and create a graph-based end-to-end tracker.
As for the training issue, you could try data augmentation, i.e. generating more diverse graphs for both SG and TG. Taking TG as an example, combine more frames and use different combinations of cameras during training.
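To make the augmentation suggestion concrete, here is a minimal sketch of how varied frame windows and camera subsets could be sampled before building training graphs. It is not taken from the ReST code base: the function `sample_tg_inputs` and all of its parameters are hypothetical, and the graph construction itself is left out.

```python
import itertools
import random


def sample_tg_inputs(num_frames, camera_ids, window_sizes=(2, 3, 4),
                     min_cameras=2, num_samples=5, seed=0):
    """Draw random (frame window, camera subset) pairs.

    Hypothetical helper: the names are illustrative, not ReST config keys.
    Requires num_frames >= max(window_sizes).
    """
    rng = random.Random(seed)
    # Enumerate every camera subset with at least `min_cameras` members.
    subsets = [
        combo
        for k in range(min_cameras, len(camera_ids) + 1)
        for combo in itertools.combinations(camera_ids, k)
    ]
    samples = []
    for _ in range(num_samples):
        size = rng.choice(window_sizes)                   # frames to combine
        start = rng.randrange(0, num_frames - size + 1)   # window start index
        frames = list(range(start, start + size))         # consecutive frames
        cameras = rng.choice(subsets)                     # one camera combination
        samples.append((frames, cameras))
    return samples


if __name__ == "__main__":
    for frames, cameras in sample_tg_inputs(20, ["c0", "c1", "c2", "c3"]):
        print(frames, cameras)
```

Each sampled pair would then be turned into one training graph, so the model sees windows of different lengths and different camera combinations instead of a single fixed configuration.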
Thanks a lot for helping me.
Hi @chengche6230
I saw in the paper that you use MVDeTr for inference. Could you please give some examples of how to use ReST for inference on a video that has no ground truth?
Thanks for helping.
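For the detector-first workflow discussed above (run any detector, save its output, then run the tracker), a rough sketch of the saving step might look like the following. The JSON layout, the field names, and both functions are assumptions made for illustration only. They are not the input format ReST expects, so the records would need to be adapted to match the repository's own detection files.

```python
import json


def build_detection_records(per_frame_boxes, min_score=0.5):
    """Flatten detector output into a list of JSON-serializable records.

    per_frame_boxes maps (frame_id, camera_id) to a list of
    (x, y, w, h, score) tuples produced by any detector.
    The record layout below is hypothetical, not ReST's actual format.
    """
    records = []
    for (frame_id, camera_id), boxes in sorted(per_frame_boxes.items()):
        for x, y, w, h, score in boxes:
            if score < min_score:
                continue  # drop low-confidence detections
            records.append({
                "frame": frame_id,
                "camera": camera_id,
                "bbox": [x, y, w, h],
                "score": score,
            })
    return records


def save_records(records, path):
    """Write the records to `path` as indented JSON."""
    with open(path, "w") as f:
        json.dump(records, f, indent=2)


if __name__ == "__main__":
    # Toy detections for two cameras in frame 0; values are made up.
    detections = {
        (0, 0): [(10, 20, 30, 60, 0.9), (50, 40, 25, 55, 0.3)],
        (0, 1): [(12, 22, 28, 58, 0.8)],
    }
    save_records(build_detection_records(detections), "detections.json")
```

No ground truth is involved at this stage: the file contains only detector output, which is what the tracker consumes.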