ifzhang / FairMOT

[IJCV-2021] FairMOT: On the Fairness of Detection and Re-Identification in Multi-Object Tracking
MIT License
3.98k stars 936 forks

When training it does not use all the datasets and the results are not the same #240

Open AndresOsp opened 3 years ago

AndresOsp commented 3 years ago

Hello,

Before anything thanks for sharing your amazing work.

I am recreating the training results using the "mix" dataset and the DLA34 model.

I downloaded the datasets and followed the instructions in the README by running: sh experiments/mix_dla34.sh.

However, I noticed that the dataloader does not use two of the datasets, as the terminal output shows:

Using tensorboardX
Fix size testing.
training chunk_sizes: [6, 6]
The output will be saved to  /workspace/FairMOT/src/lib/../../exp/mot/mix_dla34
Setting up data...
================================================================================
dataset summary
OrderedDict([('mot17', 1639.0), ('caltech', 1043.0), ('citypersons', 0), ('cuhksysu', 11931.0), ('prw', 933.0), ('eth', 0)])
total # identities: 15547
start index
OrderedDict([('mot17', 0), ('caltech', 1639.0), ('citypersons', 2682.0), ('cuhksysu', 2682.0), ('prw', 14613.0), ('eth', 15546.0)])
================================================================================
heads {'hm': 1, 'wh': 4, 'id': 128, 'reg': 2}

It says that the code does not use citypersons and eth. Is this normal behaviour? I tried to debug and noticed that the code in jde.py filters out those images. This is the part that does the filtering:

        # Determine the number of identities per dataset: scan every label
        # file and track the largest track ID seen (column 1 of each row).
        for ds, label_paths in self.label_files.items():
            max_index = -1
            for lp in label_paths:
                lb = np.loadtxt(lp)
                if len(lb) < 1:
                    continue  # empty label file, nothing to count
                if len(lb.shape) < 2:
                    img_max = lb[1]  # single row: take its ID directly
                else:
                    img_max = np.max(lb[:, 1])  # largest ID in this file
                if img_max > max_index:
                    max_index = img_max
            # IDs start at 0, so the identity count is max ID + 1; a dataset
            # whose labels carry no IDs never raises max_index above -1.
            self.tid_num[ds] = max_index + 1

From my analysis, it filters out the images without an ID. However, this is unexpected. I downloaded the datasets again, but this did not solve my issue.
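
For illustration, here is a minimal, self-contained sketch of the counting logic above. The toy label arrays are hypothetical stand-ins for np.loadtxt output, assuming the JDE label format [class, id, x_center, y_center, w, h] with id = -1 for boxes that have no identity annotation:

    import numpy as np
    from collections import OrderedDict

    # Hypothetical stand-ins for np.loadtxt(label_file) results.
    # id = -1 marks boxes without identity annotations.
    label_files = OrderedDict([
        ('mot17', [np.array([[0., 0., 0.5, 0.5, 0.1, 0.2],
                             [0., 7., 0.3, 0.4, 0.1, 0.2]])]),
        ('eth',   [np.array([[0., -1., 0.5, 0.5, 0.1, 0.2]])]),
    ])

    tid_num = OrderedDict()
    for ds, labels in label_files.items():
        max_index = -1
        for lb in labels:
            if len(lb) < 1:
                continue
            img_max = lb[1] if len(lb.shape) < 2 else np.max(lb[:, 1])
            if img_max > max_index:
                max_index = img_max
        tid_num[ds] = max_index + 1

    print(tid_num)
    # OrderedDict([('mot17', 8.0), ('eth', 0)])
    # a dataset whose IDs are all -1 counts 0 identities, exactly like
    # citypersons and eth in the dataset summary above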

Then I decided to train the model ignoring that part. The results (at epoch 30) are not the same as the presented ones. I ran: python track.py mot --load_model ../models/fairmot_dla34.pth --conf_thres 0.6

The results for MOT17 are:

Sequence | IDF1 | IDP | IDR | Rcll | Prcn | GT | MT | PT | ML | FP | FN | IDs | FM | MOTA | MOTP
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
MOT17-02-SDP | 56.30% | 84.80% | 42.10% | 49.60% | 99.90% | 62 | 11 | 33 | 18 | 13 | 9358 | 68 | 497 | 49.20% | 0.181
MOT17-04-SDP | 87.40% | 91.70% | 83.50% | 90.80% | 99.70% | 83 | 63 | 16 | 4 | 110 | 4383 | 34 | 537 | 90.50% | 0.158
MOT17-05-SDP | 73.90% | 92.70% | 61.50% | 66.10% | 99.60% | 133 | 28 | 64 | 41 | 20 | 2348 | 34 | 173 | 65.30% | 0.161
MOT17-09-SDP | 69.60% | 79.10% | 62.20% | 77.00% | 97.90% | 26 | 15 | 10 | 1 | 88 | 1225 | 25 | 91 | 74.90% | 0.153
MOT17-10-SDP | 55.60% | 88.20% | 40.60% | 45.50% | 99.00% | 57 | 13 | 17 | 27 | 59 | 6994 | 33 | 433 | 44.80% | 0.188
MOT17-11-SDP | 84.80% | 95.30% | 76.40% | 79.60% | 99.40% | 75 | 31 | 25 | 19 | 49 | 1924 | 19 | 152 | 78.90% | 0.142
MOT17-13-SDP | 62.80% | 94.90% | 47.00% | 49.30% | 99.60% | 110 | 20 | 53 | 37 | 21 | 5901 | 39 | 785 | 48.80% | 0.189
OVERALL | 75.70% | 90.60% | 65.00% | 71.40% | 99.60% | 546 | 181 | 218 | 147 | 360 | 32133 | 252 | 2668 | 70.80% | 0.163

These differ from the results that I get by testing your model:

Sequence | IDF1 | IDP | IDR | Rcll | Prcn | GT | MT | PT | ML | FP | FN | IDs | FM | MOTA | MOTP
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
MOT17-02-SDP | 55.20% | 81.30% | 41.70% | 51.30% | 99.90% | 62 | 12 | 29 | 21 | 7 | 9052 | 62 | 480 | 50.90% | 0.18
MOT17-04-SDP | 91.20% | 95.10% | 87.70% | 92.10% | 99.90% | 83 | 60 | 18 | 5 | 59 | 3779 | 30 | 429 | 91.90% | 0.161
MOT17-05-SDP | 76.60% | 93.80% | 64.80% | 69.00% | 99.80% | 133 | 33 | 71 | 29 | 8 | 2144 | 34 | 159 | 68.40% | 0.172
MOT17-09-SDP | 75.90% | 84.60% | 68.80% | 79.80% | 98.10% | 26 | 16 | 10 | 0 | 83 | 1076 | 15 | 85 | 78.00% | 0.16
MOT17-10-SDP | 57.90% | 91.00% | 42.50% | 46.30% | 99.20% | 57 | 14 | 18 | 25 | 45 | 6893 | 31 | 399 | 45.70% | 0.192
MOT17-11-SDP | 83.90% | 93.90% | 75.80% | 80.00% | 99.20% | 75 | 29 | 27 | 19 | 59 | 1886 | 19 | 172 | 79.20% | 0.149
MOT17-13-SDP | 66.60% | 93.00% | 51.80% | 55.40% | 99.40% | 110 | 29 | 45 | 36 | 41 | 5195 | 47 | 719 | 54.60% | 0.201
OVERALL | 78.20% | 92.30% | 67.90% | 73.30% | 99.60% | 546 | 193 | 218 | 135 | 302 | 30025 | 238 | 2443 | 72.80% | 0.168

In conclusion: is it normal behaviour that citypersons and eth are reported with 0, and why does my trained model not reach the presented results?

Thanks for your help.

Cordially

ifzhang commented 3 years ago

  1. We only use ETH and CityPersons to train the detection branch because these two datasets do not have ID annotations. It indeed loads images from ETH and CityPersons (see the sketch after this list).
  2. Our model fairmot_dla34.pth is pretrained on the CrowdHuman dataset, and that brings some gain compared to training the model only on the "mix" dataset.
  3. You can set --conf_thres 0.4 to get better results on the MOT17 dataset.
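
To make point 1 concrete, here is a minimal sketch of how boxes with ID -1 can still drive detection while being excluded from the identity loss. This is an illustration of the idea, not the repository's exact code; reid_loss and its tensor names are hypothetical:

    import torch
    import torch.nn.functional as F

    def reid_loss(id_logits, target_ids):
        # id_logits:  [N, num_identities] scores from the re-ID head
        # target_ids: [N] track IDs (long), with -1 for boxes that have
        #             no identity annotation (e.g. ETH, CityPersons)
        keep = target_ids >= 0
        if keep.sum() == 0:
            # A batch of purely detection-only data contributes nothing
            # to the identity loss.
            return id_logits.new_zeros(())
        return F.cross_entropy(id_logits[keep], target_ids[keep])

    # The detection losses (heatmap, box size, offset) are computed for
    # every box regardless of its ID, so datasets without identity
    # annotations still train the detection branch.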
AndresOsp commented 3 years ago

Thank you for your answer.

Then (if I am not wrong), by running sh experiments/mix_dla34.sh, the images from ETH and Citypersons should be loaded. Do I therefore have a problem running the code?

ifzhang commented 3 years ago

Just run sh experiments/mix_dla34.sh and you can load the images from ETH and CityPersons.

AndresOsp commented 3 years ago

I did that as shown:

Using tensorboardX
Fix size testing.
training chunk_sizes: [6, 6]
The output will be saved to  /workspace/FairMOT/src/lib/../../exp/mot/mix_dla34
Setting up data...
================================================================================
dataset summary
OrderedDict([('mot17', 1639.0), ('caltech', 1043.0), ('citypersons', 0), ('cuhksysu', 11931.0), ('prw', 933.0), ('eth', 0)])
total # identities: 15547
start index
OrderedDict([('mot17', 0), ('caltech', 1639.0), ('citypersons', 2682.0), ('cuhksysu', 2682.0), ('prw', 14613.0), ('eth', 15546.0)])
================================================================================
heads {'hm': 1, 'wh': 4, 'id': 128, 'reg': 2}

So the algorithm does not use those images to train the detection branch.

ifzhang commented 3 years ago

Ah, I see. The output is the same as mine. The numbers are the numbers of IDs, not of images, which is why citypersons and eth show 0. It does indeed use the images of ETH and CityPersons to train the detection branch. Do not worry about that.
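
This reading is consistent with the "start index" line in the log: it is an exclusive cumulative sum of the identity counts, which is why citypersons and cuhksysu share the value 2682.0. A small sketch reproduces the printed values:

    from collections import OrderedDict

    # Identity counts as printed in the "dataset summary" above.
    tid_num = OrderedDict([('mot17', 1639.0), ('caltech', 1043.0),
                           ('citypersons', 0), ('cuhksysu', 11931.0),
                           ('prw', 933.0), ('eth', 0)])

    # Exclusive cumulative sum: each dataset's IDs start where the
    # previous dataset's end, so a dataset with 0 IDs shares its start
    # index with the next one.
    start_index, last = OrderedDict(), 0
    for ds, n in tid_num.items():
        start_index[ds] = last
        last += n

    print(start_index)
    # OrderedDict([('mot17', 0), ('caltech', 1639.0),
    #              ('citypersons', 2682.0), ('cuhksysu', 2682.0),
    #              ('prw', 14613.0), ('eth', 15546.0)])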

AndresOsp commented 3 years ago

Thank you for your quick answer.

Then, I trained the model using sh experiments/mix_dla34.sh. The trained model does not match the results of fairmot_dla34.pth when I compare python track.py mot --load_model ../models/fairmot_dla34.pth --conf_thres 0.6 vs python track.py mot --load_model ../exp/mot/mix_dla34/model_last.pth --conf_thres 0.6. Do you have any insights into this difference?

Regards