zhengthomastang / 2018AICity_TeamUW

The winning method in Track 1 and Track 3 at the 2nd AI City Challenge Workshop in CVPR 2018 - Official Implementation
http://openaccess.thecvf.com/content_cvpr_2018_workshops/w3/html/Tang_Single-Camera_and_Inter-Camera_CVPR_2018_paper.html

Can I use it to train for Multiple camera Multiple person tracking problem? #18

Open KunalArora opened 4 years ago

KunalArora commented 4 years ago

Hello organizers,

Thank you for the code — it's great work. For my thesis research, I want to build a system that tracks multiple people in a multi-camera scenario. I believe your code could be extended, or your models retrained, to do that. Could you please share some insights on whether this is possible?

zhengthomastang commented 4 years ago

Sure. The same code can be applied to the person scenario. You just need to change the object detector to output person locations.

KunalArora commented 4 years ago

Thank you for the response. But I believe I need to retrain the whole system — detector, ReID, and tracking — on person-specific datasets, right?

And, as you mentioned, I need to change the object detector to output person locations. That means I need to make changes to detection/tools/infer_simple_txt.py, since that is what is called from run.sh, right?

zhengthomastang commented 4 years ago

The provided pre-trained models for YOLOv2 cannot be used to detect people. You can use the pre-trained models on ImageNet or MS COCO instead. All you need to do is extract the detected people from the results. We suggest you try more advanced object detectors like YOLOv3 and Faster R-CNN; the pre-trained models provided with them should be accurate enough. The ReID and tracking parts are not dependent on the object type.
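The filtering step described above — keeping only person detections from a COCO-pretrained detector's output — might look like the following minimal sketch. The tuple format, class index, and threshold are assumptions for illustration; the actual values depend on which detector wrapper you use:

```python
# Filter COCO-style detections down to the "person" class.
# Assumed format per detection: (class_id, confidence, x, y, w, h).
# The person class index depends on the detector's label map.

PERSON_CLASS_ID = 0    # assumption: check your detector's label map
CONF_THRESHOLD = 0.5   # assumption: tune against your false-positive rate

def keep_people(detections):
    """Return only confident person detections."""
    return [d for d in detections
            if d[0] == PERSON_CLASS_ID and d[1] >= CONF_THRESHOLD]

detections = [
    (0, 0.92, 10, 20, 50, 120),   # person, confident -> kept
    (2, 0.88, 30, 40, 80, 60),    # car -> dropped
    (0, 0.31, 5, 5, 40, 100),     # person, low confidence -> dropped
]
print(keep_people(detections))    # only the first detection survives
```

Writing these filtered boxes in the same text format the repository's detector stage already emits would then leave the downstream ReID and tracking stages unchanged.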

KunalArora commented 4 years ago

Okay, great. So if I change the detector to output person detections, I don't need to make further changes to the ReID and tracker, since they take their input from the detection stage, right?

Any pointers on where I have to make these changes? A more detailed reference would be appreciated.
Also, I believe I can use this code for real-time tracking, right?

zhengthomastang commented 4 years ago

For object detection, there is not much you need to change. All you need is to use the pre-trained models to generate detection results and extract the person objects from them. For ReID, we used transfer learning, i.e., using a pre-trained model to extract features, so there is no need for training. However, we found that using metric learning leads to better performance; you can refer to our latest paper in CVPR 2019 about the CityFlow dataset. We also have a better single-camera tracker that you can find here: https://github.com/ipl-uw/2019-CVPR-AIC-Track-1-UWIPL.
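As a rough illustration of the transfer-learning approach mentioned above: once a pre-trained CNN has produced an appearance embedding for each detection, ReID reduces to comparing embeddings, for example by cosine similarity. The feature vectors and threshold below are made up for illustration; in practice the threshold is tuned on validation data:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy embeddings standing in for CNN features of two detections.
feat_query   = [0.1, 0.8, 0.3]
feat_gallery = [0.12, 0.79, 0.28]

SAME_ID_THRESHOLD = 0.9   # assumption: tune on validation data
sim = cosine_similarity(feat_query, feat_gallery)
print(sim > SAME_ID_THRESHOLD)  # near-identical toy vectors -> True
```

Metric learning, as mentioned above, improves on this by training the embedding so that same-identity pairs score higher than different-identity pairs.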

Since our code has been divided into separate components, you may need to integrate them into a standalone pipeline for real-time tracking.
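A very rough skeleton of what such an integration might look like — the stage functions here are hypothetical stubs standing in for the repository's detection, ReID, and tracking modules, not its actual interfaces:

```python
def detect(frame):
    """Hypothetical detector stub: returns (box, confidence) pairs."""
    return [((10, 20, 50, 120), 0.9)]

def extract_features(frame, boxes):
    """Hypothetical ReID stub: one feature vector per box."""
    return [[0.1, 0.8, 0.3] for _ in boxes]

def update_tracks(tracks, boxes, features):
    """Hypothetical tracker stub: naively assigns one ID per detection."""
    for box, feat in zip(boxes, features):
        tracks.append({"id": len(tracks), "box": box, "feat": feat})
    return tracks

def run_pipeline(frames):
    """Chain detection -> ReID -> tracking per frame, online."""
    tracks = []
    for frame in frames:
        dets = detect(frame)
        boxes = [box for box, conf in dets if conf > 0.5]
        feats = extract_features(frame, boxes)
        tracks = update_tracks(tracks, boxes, feats)
    return tracks

print(len(run_pipeline(["frame0", "frame1"])))  # 2 track entries
```

The point is structural: instead of each stage reading and writing intermediate files as in the released code, the stages share data in memory per frame, which is what makes real-time operation feasible.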

KunalArora commented 4 years ago

Can you please provide the actual repository for the multi-camera vehicle tracking code? I ask because the Track 3/1_Multi-Camera Vehicle Tracking and Re-identification folder does not contain any code, only a Readme.md.

zhengthomastang commented 4 years ago

You can find the link to all the repositories we used here: https://github.com/zhengthomastang/2018AICity_TeamUW/tree/master/Track3

The main repository is this one: https://github.com/AlexXiao95/Multi-camera-Vehicle-Tracking-and-Reidentification

KunalArora commented 4 years ago

Okay, thank you so much for the response. One more thing: is it possible to train and run this on CPU only, without any GPU support?

zhengthomastang commented 4 years ago

Yes. It is possible to extract features with CPU only. You can also try more advanced pre-trained models in PyTorch, which are probably easier for inference on CPU.

haroonrashid235 commented 4 years ago

@KunalArora Have you been able to get the multi-camera tracker working? I am working on a similar problem and want to know what modifications are needed to get this working ASAP (apart from changing the backend detector).

haroonrashid235 commented 4 years ago

@zhengthomastang

demo.mp4

Is this the result of your demo? If yes, can you please confirm whether I can extend it to multiple targets? The demo shows only one target vehicle being tracked across multiple cameras. Can you also please comment on the FPS you are getting?

zhengthomastang commented 4 years ago

@haroonrashid235

Yes. The demo was generated using the code in this repository, and you can extend it to multiple targets. For the 2018 challenge, we only selected the targets with the highest confidence because there were too many false positives. We didn't measure the FPS because the pipeline was broken down into multiple modules; further work is still needed to combine them into an end-to-end framework.
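The highest-confidence target selection mentioned above could be sketched as follows; the detection dictionary format is an assumption for illustration:

```python
def top_target(detections):
    """Pick the single detection with the highest confidence,
    as a crude guard against false positives."""
    if not detections:
        return None
    return max(detections, key=lambda d: d["conf"])

dets = [
    {"id": "a", "conf": 0.71},
    {"id": "b", "conf": 0.94},
    {"id": "c", "conf": 0.55},
]
print(top_target(dets)["id"])  # b
```

Extending to multiple targets would mean keeping all detections above a confidence threshold instead of only the maximum, at the cost of admitting more false positives.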

KunalArora commented 4 years ago

@haroonrashid235 I am still working on this task of making it work for people and developing the end-to-end pipeline from detection through tracking.

@zhengthomastang
I would really appreciate your help in letting me know what could be done to develop the end-to-end pipeline. A general guideline or idea would help a lot.

zhengthomastang commented 4 years ago

@KunalArora You can refer to my paper to get an idea of the workflow of multi-target multi-camera (MTMC) tracking: https://zhengthomastang.github.io/publications/CityFlow/
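One recurring step in MTMC workflows like the one described in the paper (though not necessarily the paper's exact method) is associating single-camera tracklets across cameras by appearance similarity. A greedy sketch, where each tracklet is assumed to be an (id, mean-feature) pair:

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def associate(tracklets_cam1, tracklets_cam2, threshold=0.9):
    """Greedily match tracklets across two cameras by feature similarity.
    Tracklet format (id, mean_feature) is an assumption for illustration."""
    matches = []
    used = set()
    for tid1, f1 in tracklets_cam1:
        best, best_sim = None, threshold
        for tid2, f2 in tracklets_cam2:
            if tid2 in used:
                continue
            sim = cosine(f1, f2)
            if sim > best_sim:
                best, best_sim = tid2, sim
        if best is not None:
            used.add(best)
            matches.append((tid1, best))
    return matches

cam1 = [("c1_t0", [0.1, 0.9]), ("c1_t1", [0.9, 0.1])]
cam2 = [("c2_t0", [0.88, 0.12]), ("c2_t1", [0.12, 0.91])]
print(associate(cam1, cam2))  # [('c1_t0', 'c2_t1'), ('c1_t1', 'c2_t0')]
```

Real systems typically replace the greedy loop with optimal assignment (e.g. the Hungarian algorithm) and add spatio-temporal constraints between cameras, but the structure of the problem is the same.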