Closed: Gao-JT closed this issue 4 years ago
@JudasDie Thank you for your excellent research work. I ran into some problems when testing on OTB-2015 with a SiamRPNRes22 model I trained myself on ['VID']. The best result on the OTB-2015 benchmark among the SiamRPNRes22 checkpoints is 0.4041.
- The evaluation log on the OTB-2015 benchmark is as follows (one line per checkpoint, sorted here by epoch):

```
./result/OTB2015/SiamRPNRes22checkpoint_e30(0.3890)
./result/OTB2015/SiamRPNRes22checkpoint_e31(0.3696)
./result/OTB2015/SiamRPNRes22checkpoint_e32(0.3728)
./result/OTB2015/SiamRPNRes22checkpoint_e33(0.3527)
./result/OTB2015/SiamRPNRes22checkpoint_e34(0.3807)
./result/OTB2015/SiamRPNRes22checkpoint_e35(0.3841)
./result/OTB2015/SiamRPNRes22checkpoint_e36(0.3854)
./result/OTB2015/SiamRPNRes22checkpoint_e37(0.3846)
./result/OTB2015/SiamRPNRes22checkpoint_e39(0.3814)
./result/OTB2015/SiamRPNRes22checkpoint_e40(0.4041)
./result/OTB2015/SiamRPNRes22checkpoint_e41(0.3974)
./result/OTB2015/SiamRPNRes22checkpoint_e42(0.3880)
./result/OTB2015/SiamRPNRes22checkpoint_e43(0.3840)
./result/OTB2015/SiamRPNRes22checkpoint_e44(0.3885)
./result/OTB2015/SiamRPNRes22checkpoint_e45(0.3858)
./result/OTB2015/SiamRPNRes22checkpoint_e46(0.3926)
./result/OTB2015/SiamRPNRes22checkpoint_e47(0.3920)
./result/OTB2015/SiamRPNRes22checkpoint_e48(0.3865)
./result/OTB2015/SiamRPNRes22checkpoint_e49(0.3815)
./result/OTB2015/SiamRPNRes22checkpoint_e50(0.3861)
```

OTB2015 Best: ./result/OTB2015/SiamRPNRes22checkpoint_e40(0.4041)
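Since each line above encodes the epoch and score, picking the best checkpoint can be automated; here is a minimal sketch (my own helper, not part of SiamDW) that parses lines in this format:

```python
import re

# Two sample lines in the format "./result/OTB2015/<model>checkpoint_e<N>(<score>)".
lines = """
./result/OTB2015/SiamRPNRes22checkpoint_e39(0.3814)
./result/OTB2015/SiamRPNRes22checkpoint_e40(0.4041)
""".split()

# Extract the score inside the parentheses and take the maximum.
score = re.compile(r"checkpoint_e(\d+)\(([\d.]+)\)")
best = max(lines, key=lambda line: float(score.search(line).group(2)))
print("OTB2015 Best:", best)  # -> ./result/OTB2015/SiamRPNRes22checkpoint_e40(0.4041)
```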
- The training log of SiamRPNRes22 trained on ['VID'] is as follows:

```
Epoch: [50][1480/1563] lr : 0.0000100 Batch Time: 0.397 Data Time:0.016 CLS_Loss:0.04349 REG_Loss:0.18869 Loss:0.23219 Progress: 78067 / 78150 [99%], Speed: 0.397 s/iter, ETA 0:00:00 (D:H:M)
PROGRESS: 99.89%
Epoch: [50][1490/1563] lr : 0.0000100 Batch Time: 0.397 Data Time:0.016 CLS_Loss:0.04349 REG_Loss:0.18872 Loss:0.23220 Progress: 78077 / 78150 [99%], Speed: 0.397 s/iter, ETA 0:00:00 (D:H:M)
PROGRESS: 99.91%
Epoch: [50][1500/1563] lr : 0.0000100 Batch Time: 0.397 Data Time:0.016 CLS_Loss:0.04349 REG_Loss:0.18872 Loss:0.23221 Progress: 78087 / 78150 [99%], Speed: 0.397 s/iter, ETA 0:00:00 (D:H:M)
PROGRESS: 99.92%
Epoch: [50][1510/1563] lr : 0.0000100 Batch Time: 0.397 Data Time:0.016 CLS_Loss:0.04345 REG_Loss:0.18871 Loss:0.23216 Progress: 78097 / 78150 [99%], Speed: 0.397 s/iter, ETA 0:00:00 (D:H:M)
PROGRESS: 99.93%
Epoch: [50][1520/1563] lr : 0.0000100 Batch Time: 0.397 Data Time:0.015 CLS_Loss:0.04344 REG_Loss:0.18871 Loss:0.23215 Progress: 78107 / 78150 [99%], Speed: 0.397 s/iter, ETA 0:00:00 (D:H:M)
PROGRESS: 99.94%
Epoch: [50][1530/1563] lr : 0.0000100 Batch Time: 0.397 Data Time:0.015 CLS_Loss:0.04347 REG_Loss:0.18870 Loss:0.23217 Progress: 78117 / 78150 [99%], Speed: 0.397 s/iter, ETA 0:00:00 (D:H:M)
PROGRESS: 99.96%
Epoch: [50][1540/1563] lr : 0.0000100 Batch Time: 0.397 Data Time:0.015 CLS_Loss:0.04346 REG_Loss:0.18869 Loss:0.23215 Progress: 78127 / 78150 [99%], Speed: 0.397 s/iter, ETA 0:00:00 (D:H:M)
PROGRESS: 99.97%
Epoch: [50][1550/1563] lr : 0.0000100 Batch Time: 0.397 Data Time:0.015 CLS_Loss:0.04349 REG_Loss:0.18868 Loss:0.23217 Progress: 78137 / 78150 [99%], Speed: 0.397 s/iter, ETA 0:00:00 (D:H:M)
PROGRESS: 99.98%
Epoch: [50][1560/1563] lr : 0.0000100 Batch Time: 0.397 Data Time:0.015 CLS_Loss:0.04347 REG_Loss:0.18867 Loss:0.23214 Progress: 78147 / 78150 [99%], Speed: 0.397 s/iter, ETA 0:00:00 (D:H:M)
PROGRESS: 100.00%
```
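As a sanity check, the final learning rate of 0.0000100 at epoch 50 matches a log-spaced schedule from LR: 0.01 down to LR_END: 0.00001 (see the config below). A minimal sketch, assuming LR_POLICY: 'log' means log-spaced per-epoch rates; the exact SiamDW implementation may differ:

```python
import numpy as np

# Hypothetical reconstruction of a 'log' LR policy: 50 log-spaced values
# from LR = 0.01 down to LR_END = 0.00001, one per epoch.
lrs = np.logspace(np.log10(0.01), np.log10(0.00001), num=50)
print(f"epoch 1: {lrs[0]:.7f}  epoch 50: {lrs[-1]:.7f}")  # 0.0100000 / 0.0000100
```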
- The SiamRPN.yaml used for training SiamRPNRes22 is as follows (the unused alternatives are shown as comments):

```yaml
SIAMRPN:
  GPUS: '0,1,2,3'
  PRINT_FREQ: 10
  WORKERS: 32
  OUTPUT_DIR: 'logs'          # log file
  CHECKPOINT_DIR: 'snapshot'  # checkpoint file

  TRAIN:
    ISTRUE: True              # whether to train
    MODEL: "SiamRPNRes22"
    START_EPOCH: 0
    END_EPOCH: 50
    TEMPLATE_SIZE: 127
    SEARCH_SIZE: 255
    STRIDE: 8
    BATCH: 32                 # 32
    RESUME: False
    PRETRAIN: 'CIResNet22_pretrain.model'
    LR_POLICY: 'log'
    LR: 0.01
    LR_END: 0.00001
    MOMENTUM: 0.9
    WEIGHT_DECAY: 0.0005      # 0.0005->1
    CLS_WEIGHT: 1
    REG_WEIGHT: 1
    CLS_TYPE: 'thicker'       # thicker or thinner->thinner
    # WHICH_USE: ['YTB', 'VID', 'COCO', 'DET']  # add any data you want, e.g. ['GOT10K', 'LASOT']
    WHICH_USE: ['VID']        # add any data you want, e.g. ['GOT10K', 'LASOT']
    ANCHORS_RATIOS: [0.33, 0.5, 1, 2, 3]
    ANCHORS_SCALES: [8]
    ANCHORS_THR_HIGH: 0.6
    ANCHORS_THR_LOW: 0.3
    ANCHORS_POS_KEEP: 16
    ANCHORS_ALL_KEEP: 64

  TEST:                       # TEST model is the same as TRAIN.MODEL
    ISTRUE: False             # whether to test
    THREADS: 16               # multi-thread test
    DATA: 'VOT2016'
    START_EPOCH: 20
    END_EPOCH: 50

  TUNE:                       # TUNE model is the same as TRAIN.MODEL
    ISTRUE: False             # whether to tune
    DATA: 'VOT2016'
    METHOD: 'TPE'

  DATASET:
    SHIFT: 4
    SCALE: 0.05
    COLOR: 1
    FLIP: 0
    BLUR: 0.2
    ROTATION: 0
    # add data path in WHICH_USE
    # you can ablate here to find which data and ratio is better for your task
    VID:
      # PATH: '/data/home/zzp/data/vid/crop271'
      # ANNOTATION: '/data/home/zzp/data/vid/train.json'
      PATH: '/home/jhvision5/SiamDW_DATA/VID/crop255'
      ANNOTATION: '/home/jhvision5/SiamDW_DATA/VID/train.json'
      RANGE: 100
      USE: 200000
```
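For context, the ANCHORS_* entries define the RPN anchor set: 5 aspect ratios × 1 scale gives 5 anchors per spatial position, each with base area (STRIDE × ANCHORS_SCALES[0])² = 64 × 64 pixels. The sketch below shows the usual SiamRPN-style construction; it is an illustration of the convention, not the exact SiamDW code:

```python
import numpy as np

# 5 ratios x 1 scale -> 5 anchors per position, all with area (8 * 8)**2.
stride, scales, ratios = 8, [8], [0.33, 0.5, 1, 2, 3]
anchors = []
for ratio in ratios:
    for scale in scales:
        area = (stride * scale) ** 2    # base anchor area: 64 * 64
        w = int(np.sqrt(area / ratio))  # width shrinks as the ratio grows
        h = int(w * ratio)              # height = width * ratio
        anchors.append((w, h))
print(anchors)  # [(111, 36), (90, 45), (64, 64), (45, 90), (36, 108)]
```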
- I don't know what caused the bad results on the OTB-2015 benchmark with the SiamRPNRes22 model I trained on ['VID'] myself. The best result among the checkpoints is 0.4041, which is far from 0.666. Are there any problems in the training phase of SiamRPNRes22? Thanks for your help!
Hi, thanks for your interest. Of course it can't produce good results with only VID; RPN-based models need more data to converge. BTW, I notice that you didn't tune the hyper-parameters. Here is my advice: if you don't have a large-scale dataset like YTB, or are limited by machine resources, you can try SiamFC-based models on GOT10K. Using 200k pairs per epoch is enough. Try SiamFCRes22 + GOT10K first, and SiamFCRes22W later. If you have any other questions, please email me (zhangzhipeng2017@ia.ac.cn).
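In terms of the YAML above, that advice would amount to roughly the following changes (a sketch only: the GOT10K paths are hypothetical placeholders mirroring how VID is registered under DATASET, and the real SiamFC config in the repo may use different keys):

```yaml
TRAIN:
  MODEL: "SiamFCRes22"      # try SiamFCRes22W later
  WHICH_USE: ['GOT10K']
DATASET:
  GOT10K:
    PATH: '/path/to/GOT10K/crop'              # hypothetical path
    ANNOTATION: '/path/to/GOT10K/train.json'  # hypothetical path
    USE: 200000                               # ~200k pairs per epoch, as advised
```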
Thank you very much for your response! I have also tried SiamRPNRes22 + VID + GOT10K, but the best result is still only 0.4+. Based on your response, is the training set [VID + GOT10K] too small for training SiamRPNRes22? Are the training pairs in [VID + GOT10K] enough for SiamFCRes22 or SiamFCRes22W, but not for SiamRPNRes22? If I want to train SiamRPNRes22, is YTB or another larger dataset necessary?
GOT10K itself is enough for SiamFC. If you want to train a good RPN, I think you should use all the data you can. BTW, RPN doesn't show leading results on OTB; you may test and tune it on VOT. If you want a better result on OTB, FC is a good choice (GOT10K alone is enough). See SiamFCRes22W in the readme.md.
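Concretely, with the config keys already shown above, that suggestion corresponds to enabling all datasets for training and flipping the existing TEST/TUNE switches to VOT (again a sketch, assuming each dataset has been cropped and registered under DATASET like VID):

```yaml
TRAIN:
  WHICH_USE: ['YTB', 'VID', 'COCO', 'DET', 'GOT10K']  # use all data you can
TEST:
  ISTRUE: True      # sweep checkpoints on VOT rather than OTB
  DATA: 'VOT2016'
TUNE:
  ISTRUE: True      # TPE hyper-parameter search
  DATA: 'VOT2016'
  METHOD: 'TPE'
```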
Thank you for your response! I have tried SiamFCRes22 + VID (+ GOT10K). The result on OTB-2015 is 0.6+, which is close to the leading results on OTB. Next, I want to train a good SiamRPNRes22; its results on OTB needn't be leading, just not too bad. I am downloading YTB and COCO, and then I will try SiamRPNRes22 + VID + GOT10K + YTB + COCO. I hope to obtain a good result! Thank you for your help again!
You are welcome. Feel free to email me for further discussion.