timmeinhardt / trackformer

Implementation of "TrackFormer: Multi-Object Tracking with Transformers". [Conference on Computer Vision and Pattern Recognition (CVPR), 2022]
https://arxiv.org/abs/2101.02702
Apache License 2.0

Custom dataset: class_error: 100 #100

Open dethresearcher opened 1 year ago

dethresearcher commented 1 year ago

Hello,

I'm trying to train a custom dataset using the private detection setting (since I plan to eventually swap out the default detector with my own). I would like to train the tracker (including the detector) simultaneously and from scratch, without any pretraining.

I am using the following command:

```
python -m torch.distributed.launch --nproc_per_node=2 --use_env src/train.py with \
    mot17 \
    deformable \
    multi_frame \
    tracking \
    resume=models/mot17_crowdhuman_deformable_multi_frame/checkpoint_epoch_40.pth \
    output_dir=models/custom_deformable_multi_frame \
    mot_path_train=data/xxx \
    mot_path_val=data/xxx \
    train_split=xxx_train_coco \
    val_split=xxx_test_coco \
    epochs=20
```

The training works until it starts the first evaluation:

  1. An assertion failure in `datasets/tracking/factory.py`, line 59, in `__init__`: `assert dataset in DATASETS, f"[!] Dataset not found: {dataset}"`

I understand the error relates to the dataset not being registered under tracking/.*, but I wanted to make sure this step is actually required, since it is not stated in the README, which mentions that a custom dataset can be used "without changing our codebase".

  2. class_error is 100, and most of the losses are 0. Is this expected?

```
Epoch: [1] [4900/6318] eta: 0:06:30 lr: 0.000100 class_error: 100.00 loss: 0.0000 (0.0376) loss_bbox: 0.0000 (0.0000) loss_bbox_0: 0.0000 (0.0000) loss_bbox_1: 0.0000 (0.0000) loss_bbox_2: 0.0000 (0.0000) loss_bbox_3: 0.0000 (0.0000) loss_bbox_4: 0.0000 (0.0000) loss_ce: 0.0000 (0.0078) loss_ce_0: 0.0000 (0.0013) loss_ce_1: 0.0000 (0.0049) loss_ce_2: 0.0000 (0.0078) loss_ce_3: 0.0000 (0.0077) loss_ce_4: 0.0000 (0.0081) loss_giou: 0.0000 (0.0000) loss_giou_0: 0.0000 (0.0000) loss_giou_1: 0.0000 (0.0000) loss_giou_2: 0.0000 (0.0000) loss_giou_3: 0.0000 (0.0000) loss_giou_4: 0.0000 (0.0000) cardinality_error_unscaled: 498.0000 (498.2397) cardinality_error_0_unscaled: 498.0000 (498.5926) cardinality_error_1_unscaled: 499.5000 (499.5661) cardinality_error_2_unscaled: 499.5000 (499.4653) cardinality_error_3_unscaled: 500.0000 (499.9463) cardinality_error_4_unscaled: 500.0000 (499.8592) class_error_unscaled: 100.0000 (100.0000) loss_bbox_unscaled: 0.0000 (0.0000) loss_bbox_0_unscaled: 0.0000 (0.0000) loss_bbox_1_unscaled: 0.0000 (0.0000) loss_bbox_2_unscaled: 0.0000 (0.0000) loss_bbox_3_unscaled: 0.0000 (0.0000) loss_bbox_4_unscaled: 0.0000 (0.0000) loss_ce_unscaled: 0.0000 (0.0039) loss_ce_0_unscaled: 0.0000 (0.0007) loss_ce_1_unscaled: 0.0000 (0.0025) loss_ce_2_unscaled: 0.0000 (0.0039) loss_ce_3_unscaled: 0.0000 (0.0038) loss_ce_4_unscaled: 0.0000 (0.0040) loss_giou_unscaled: 0.0000 (0.0000) loss_giou_0_unscaled: 0.0000 (0.0000) loss_giou_1_unscaled: 0.0000 (0.0000) loss_giou_2_unscaled: 0.0000 (0.0000) loss_giou_3_unscaled: 0.0000 (0.0000) loss_giou_4_unscaled: 0.0000 (0.0000) lr_backbone: 0.0000 (0.0000) time: 0.5473 data: 0.0034 max mem: 7175
```
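For context on the numbers above: a cardinality error near 500 with a 500-query Deformable DETR means essentially every query (or almost none) is being classified as a foreground object. A minimal, pure-Python sketch of the DETR-style cardinality metric (illustrative names, not this repo's exact implementation):

```python
def cardinality_error(pred_classes, num_targets, no_object_idx):
    """L1 distance between the predicted and true object counts (DETR-style).

    pred_classes: predicted class index per query.
    no_object_idx: index of the "no object" / background class.
    """
    num_pred = sum(1 for c in pred_classes if c != no_object_idx)
    return abs(num_pred - num_targets)

# 500 queries, every one predicting class 0 ("person"), 2 ground-truth boxes:
print(cardinality_error([0] * 500, 2, no_object_idx=1))  # 498
```

A value of ~498-500 per frame, as in the log, therefore suggests the model never learns to suppress background queries, which fits the all-zero losses.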

  3. Does this repo train the detector + tracker together?

  4. I am trying to evaluate with private detections, but it seems the *_sequence.py files call the _sequence() method, which requires public detections?

MuzziKim commented 1 year ago

same issue...

timmeinhardt commented 1 year ago
  1. The README says it can be trained without code changes. ;) But yes, to run an evaluation one has to register the dataset in the factory.py file. A custom dataset might require its own evaluation code anyway, hence we did not give further details on that matter.
  2. The class error should go down. Your losses are also mostly zero, which does not look correct.
  3. Yes, the idea of the paper is a unified model that performs detection and tracking. So I am not sure how you would swap your detector for our Deformable DETR, unless it has a similar Transformer decoder architecture with cross-attention between image features and object queries.
  4. As mentioned in point 1, you need to add your own eval code/file. You can copy one of the *_sequence.py files and adjust it to your needs, e.g., remove the requirement for public detection files.
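For point 1, the registration the assertion asks for amounts to adding an entry to the `DATASETS` mapping in `datasets/tracking/factory.py`. The structure below is a hypothetical sketch (the split and sequence names are placeholders; check the file for the real factory signature):

```python
# Sketch of datasets/tracking/factory.py -- adapt to the actual factory code.
DATASETS = {}

# Existing entries map a split name to its sequences, e.g. (illustrative):
DATASETS['MOT17-TRAIN'] = ['MOT17-02', 'MOT17-04']

# A custom dataset only needs its split registered the same way
# (hypothetical sequence names):
DATASETS['xxx_test_coco'] = ['xxx-seq-01', 'xxx-seq-02']

def get_dataset(dataset):
    # This is the check that fails in the traceback above.
    assert dataset in DATASETS, f"[!] Dataset not found: {dataset}"
    return DATASETS[dataset]
```

With the split registered, the assertion in the first evaluation no longer fires; the remaining work is the per-sequence evaluation code mentioned in point 4.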
dethresearcher commented 1 year ago

Thanks for the answer @timmeinhardt!

Regarding the losses mostly being zero -- do you have any ideas about what could be wrong, e.g., in the data annotations or the hyperparameters?

timmeinhardt commented 1 year ago

The data annotations are definitely the right place to look. This could be due to wrong label indices for the background. The datasets/coco.py code expects the person label to be at index 1 in the ground truth file. If your custom dataset has more than one label, you need to check the code as well; this is not supported out of the box and might need some adjustment here and there.
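A quick sanity check along these lines, assuming a standard COCO-format ground truth JSON (the function name and the `ignore` field usage are illustrative):

```python
import json
from collections import Counter

def check_labels(gt_path):
    """Report the category_id distribution and ignore flags in a COCO gt file."""
    with open(gt_path) as f:
        gt = json.load(f)
    cat_ids = Counter(ann['category_id'] for ann in gt['annotations'])
    ignored = sum(ann.get('ignore', 0) for ann in gt['annotations'])
    print('category_id counts:', dict(cat_ids))
    print('annotations flagged ignore:', ignored)
    # For this codebase, everything should be category_id 1 (person),
    # and most annotations should have ignore: 0.
    assert set(cat_ids) == {1}, 'person label must be at index 1'
```

Running this on the custom ground truth file would have surfaced both failure modes discussed in this thread: wrong category indices and annotations blanket-flagged as ignored.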

dethresearcher commented 1 year ago

Thanks @timmeinhardt!

The problem was related to using the generate_coco_from_mot.py script with a custom dataset: by calling generate_coco_from_mot with mots=False, I had effectively set ignore: 1 for all annotations, because row[8] = -1 in my case.
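To illustrate the failure mode: in the MOT ground truth format, row[8] is the visibility column, and the conversion script marks a box as ignored when its visibility falls below a threshold. A hedged sketch of that step (the function and threshold name are illustrative, not the script's exact code):

```python
def annotation_from_mot_row(row, vis_threshold=0.25):
    """Sketch of the gt.txt -> COCO conversion step discussed above.

    MOT gt.txt columns: frame, track_id, x, y, w, h, conf, class, visibility.
    With visibility fixed at -1, as in the custom data here, every
    annotation fails the threshold and ends up with ignore: 1.
    """
    frame, track_id, x, y, w, h, conf, cls, vis = row
    return {
        'bbox': [x, y, w, h],
        'visibility': vis,
        'ignore': 0 if vis > vis_threshold else 1,
    }

print(annotation_from_mot_row([1, 1, 10, 10, 50, 100, 1, 1, -1])['ignore'])  # 1
```

With every annotation ignored, the matcher sees no valid targets, which explains the zero losses and the constant class_error of 100.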

Could you clarify why python src/track.py with reid counts as private detection? It seems that all evaluation processes call _sequence, which loads detection results from det.txt. In particular, python src/track.py with reid uses sequences such as MOT17-01-FRCNN, etc., which I thought were public detections?

timmeinhardt commented 1 year ago

The code might load the detection results, but it does not use them. As a quick fix you can put your ground truth files in place of the detections and then run in private mode. See track.yaml for the config entry which enables public detections; it is set to False by default.
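The quick fix above -- giving the sequence loader a detection file it can parse, even though private mode never uses the boxes -- can be done with a small conversion script. A sketch assuming standard MOT file layouts (the function name is mine):

```python
import csv

def gt_to_det(gt_path, det_path):
    """Write a det.txt from gt.txt so the sequence loader has a file to read.

    det.txt rows follow the MOT convention:
    frame, -1, x, y, w, h, conf, -1, -1, -1
    In private mode (public_detections disabled in track.yaml) the boxes are
    never consumed, so their content only has to parse.
    """
    with open(gt_path) as f_in, open(det_path, 'w', newline='') as f_out:
        writer = csv.writer(f_out)
        for row in csv.reader(f_in):
            frame, _track_id, x, y, w, h = row[:6]
            writer.writerow([frame, -1, x, y, w, h, 1, -1, -1, -1])
```

Once the real detection-file requirement is removed from the copied *_sequence.py (point 4 above), this stand-in file is no longer needed at all.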