Closed liangxiao05 closed 4 years ago
Also, I find that the coco_tracking model is only 79.5 MB but has better generalization on object detection tasks than some larger, well-known models such as EfficientDet when tested on real-world scenes. How do you train this model, and do you use any tricks during training?
Thank you for liking our projects. The code for generating a fake previous frame by augmentation is here.
We are glad to know that CenterNet works better than EfficientDet in your scenarios. One detail that might matter is that CenterNet can easily handle ignored annotations ("iscrowd" labels in COCO) by simply masking out the corresponding regions of the ground-truth heatmap. However, I didn't see this handled in detectron2 or mmdet. It didn't improve COCO AP, but it may help the model generalize better in the real world.
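The thread doesn't show how the masking is done; a minimal sketch of the idea (a hypothetical helper, not CenterNet's actual code) is to build a 0/1 mask alongside the heatmap so that pixels inside crowd regions contribute no loss, counting as neither positives nor negatives:

```python
import numpy as np

def masked_heatmap_targets(heatmap, ignore_boxes):
    """Mask out ignored regions ('iscrowd' in COCO) of a ground-truth heatmap.

    heatmap: (H, W) float array of Gaussian peaks for one class.
    ignore_boxes: list of (x1, y1, x2, y2) pixel boxes to ignore.
    Returns the heatmap and a 0/1 mask; multiplying the per-pixel loss by
    the mask removes crowd regions from training entirely.
    """
    mask = np.ones_like(heatmap)
    for x1, y1, x2, y2 in ignore_boxes:
        mask[int(y1):int(y2), int(x1):int(x2)] = 0.0
    return heatmap, mask
```

At training time the per-pixel focal loss would simply be multiplied by `mask` before summation, so crowd regions are neither penalized as false positives nor required to fire.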
@xingyizhou about the same issue, I have a question:
In the paper you said: "Training on static images. We train a version of our model on static images only, as described in Section 4.4. The results are shown in Table 5 (3rd row, ‘Static image’). As reported in this table, training on static images gives the same performance as training on videos on the MOT dataset. Separately, we observed that training on static images is less effective on nuScenes, where framerate is low."
Is that right? In any case, can you point to the experiment where you did something like this?
@xingyizhou Thanks for your explanations.
@123alaa The ablation study on nuScenes is not included in the released experiments. I used the following command:

```
python main.py tracking,ddd --exp_id nuScenes_3Dtracking_static --dataset nuscenes --pre_hm --load_model ../models/nuScenes_3Ddetection_e140.pth --shift 0.01 --scale 0.05 --lost_disturb 0.4 --fp_disturb 0.1 --hm_disturb 0.05 --batch_size 64 --gpus 0,1,2,3 --lr 2.5e-4 --save_point 60 --max_frame_dist 1
```
I haven't tuned the augmentation parameters heavily for this experiment. Intuitively, `--shift` and `--scale` should be larger and should ideally match the inter-frame displacement of the dataset.
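For intuition about what these flags control, here is a hedged sketch of simulating a "previous frame" from static-image boxes (a hypothetical illustration of the roles of `--shift`, `--scale`, `--lost_disturb`, `--fp_disturb`, and `--hm_disturb`, not CenterTrack's actual implementation):

```python
import random

def fake_previous_detections(boxes, shift=0.01, scale=0.05,
                             lost_disturb=0.4, fp_disturb=0.1,
                             hm_disturb=0.05):
    """Simulate previous-frame detections from static-image ground truth.

    boxes: list of (cx, cy, w, h) in pixels.
    - lost_disturb: probability of dropping a box (missed detection).
    - shift / hm_disturb: random translation of the center, relative
      to box size, simulating inter-frame motion and heatmap jitter.
    - scale: random size change, simulating zoom / depth change.
    - fp_disturb: probability of adding a spurious nearby box.
    """
    prev = []
    for cx, cy, w, h in boxes:
        if random.random() < lost_disturb:
            continue  # box "lost" in the fake previous frame
        dx = (random.uniform(-shift, shift) +
              random.uniform(-hm_disturb, hm_disturb)) * w
        dy = (random.uniform(-shift, shift) +
              random.uniform(-hm_disturb, hm_disturb)) * h
        s = 1.0 + random.uniform(-scale, scale)
        prev.append((cx + dx, cy + dy, w * s, h * s))
        if random.random() < fp_disturb:
            # spurious duplicate offset by half a box: a fake false positive
            prev.append((cx + 0.5 * w, cy + 0.5 * h, w, h))
    return prev
```

The jittered boxes would then be rendered into the previous-frame heatmap (`--pre_hm`), so the tracker learns to be robust to missed and spurious detections rather than trusting the prior frame blindly.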
Closing this for now. Feel free to reopen if you have further questions.
Hi authors, thanks for your nice work and the shared code. I'm a big fan of your CenterNet architecture. I recently read your new CenterTrack paper and found that you use COCO static images to simulate tracking frames purely through image augmentation, with a large accuracy improvement. I want to try this on ordinary object detection tasks to see whether it still works, but when I looked into the codebase I couldn't find the relevant part. Can you give more details about where this code is? Thank you!