CaptainEven / MCMOT

Real time one-stage multi-class & multi-object tracking based on anchor-free detection and ReID
MIT License

Does the ReID vector contain class information? #69

Closed starsky68 closed 3 years ago

starsky68 commented 3 years ago

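With this training approach, does the resulting ReID vector contain features for multiple classes? Or, if each class corresponds to its own ReID vector, are there multiple ReID vectors, one per class? If the network ultimately outputs only one kind of ReID embedding, can the information from the different classes be guaranteed not to interfere with each other?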

CaptainEven commented 3 years ago

No, they will not interfere with each other. The ReID feature vector is tied to the object instance, not to the object class. To quantitatively measure how well the feature vectors separate different instances or cluster the same instance, see evaluate_feature_matching.py in https://github.com/CaptainEven/YOLOV4_MCMOT. Output of a run:

ssh://jaya@192.168.1.211:22/usr/bin/python3 -u /mnt/diskb/even/YOLOV4/MOTEvaluate/evaluate_feature_matching.py
Apex recommended for faster mixed precision training: https://github.com/NVIDIA/apex
{0: 524, 1: 148, 2: 185, 3: 421, 4: 56}
Using gpu: 1
Using CUDA device0 _CudaDeviceProperties(name='GeForce GTX TITAN X', total_memory=12212MB)

Darknet mode: track Output reid feature map layer ids: [-1] FC layer type: FC Embedding dimension: 128 Model Summary: 210 layers, 2.69438e+06 parameters, 2.69438e+06 gradients Darknet( (module_list): ModuleList( (0): Sequential( (Conv2d): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(32, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (1): Sequential( (Conv2d): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(32, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (2): Sequential( (Conv2d): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False) (BatchNorm2d): BatchNorm2d(32, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (3): Sequential( (Conv2d): Conv2d(32, 16, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(16, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (4): Sequential( (Conv2d): Conv2d(16, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(96, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (5): Sequential( (Conv2d): Conv2d(96, 96, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=96, bias=False) (BatchNorm2d): BatchNorm2d(96, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (6): Sequential( (Conv2d): Conv2d(96, 24, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(24, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (7): Sequential( (Conv2d): Conv2d(24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(144, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (8): Sequential( (Conv2d): 
Conv2d(144, 144, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=144, bias=False) (BatchNorm2d): BatchNorm2d(144, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (9): Sequential( (Conv2d): Conv2d(144, 24, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(24, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (10): Sequential( (WeightedFeatureFusion): WeightedFeatureFusion() ) (11): Sequential( (Conv2d): Conv2d(24, 144, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(144, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (12): Sequential( (Conv2d): Conv2d(144, 144, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=144, bias=False) (BatchNorm2d): BatchNorm2d(144, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (13): Sequential( (Conv2d): Conv2d(144, 32, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(32, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (14): Sequential( (Conv2d): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(192, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (15): Sequential( (Conv2d): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192, bias=False) (BatchNorm2d): BatchNorm2d(192, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (16): Sequential( (Conv2d): Conv2d(192, 32, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(32, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (17): Sequential( (WeightedFeatureFusion): WeightedFeatureFusion() ) (18): Sequential( (Conv2d): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(192, 
eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (19): Sequential( (Conv2d): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192, bias=False) (BatchNorm2d): BatchNorm2d(192, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (20): Sequential( (Conv2d): Conv2d(192, 32, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(32, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (21): Sequential( (WeightedFeatureFusion): WeightedFeatureFusion() ) (22): Sequential( (Conv2d): Conv2d(32, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(192, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (23): Sequential( (Conv2d): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192, bias=False) (BatchNorm2d): BatchNorm2d(192, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (24): Sequential( (Conv2d): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(64, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (25): Sequential( (Conv2d): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(384, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (26): Sequential( (Conv2d): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False) (BatchNorm2d): BatchNorm2d(384, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (27): Sequential( (Conv2d): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(64, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (28): Sequential( (WeightedFeatureFusion): WeightedFeatureFusion() ) 
(29): Sequential( (Conv2d): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(384, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (30): Sequential( (Conv2d): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False) (BatchNorm2d): BatchNorm2d(384, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (31): Sequential( (Conv2d): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(64, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (32): Sequential( (WeightedFeatureFusion): WeightedFeatureFusion() ) (33): Sequential( (Conv2d): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(384, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (34): Sequential( (Conv2d): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384, bias=False) (BatchNorm2d): BatchNorm2d(384, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (35): Sequential( (Conv2d): Conv2d(384, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(64, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (36): Sequential( (WeightedFeatureFusion): WeightedFeatureFusion() ) (37): Sequential( (Conv2d): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(384, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (38): Sequential( (Conv2d): Conv2d(384, 384, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=384, bias=False) (BatchNorm2d): BatchNorm2d(384, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (39): Sequential( (Conv2d): Conv2d(384, 96, kernel_size=(1, 1), 
stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(96, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (40): Sequential( (Conv2d): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(576, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (41): Sequential( (Conv2d): Conv2d(576, 576, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=576, bias=False) (BatchNorm2d): BatchNorm2d(576, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (42): Sequential( (Conv2d): Conv2d(576, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(96, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (43): Sequential( (WeightedFeatureFusion): WeightedFeatureFusion() ) (44): Sequential( (Conv2d): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(576, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (45): Sequential( (Conv2d): Conv2d(576, 576, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=576, bias=False) (BatchNorm2d): BatchNorm2d(576, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (46): Sequential( (Conv2d): Conv2d(576, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(96, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (47): Sequential( (WeightedFeatureFusion): WeightedFeatureFusion() ) (48): Sequential( (Conv2d): Conv2d(96, 576, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(576, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (49): Sequential( (Conv2d): Conv2d(576, 576, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=576, bias=False) (BatchNorm2d): BatchNorm2d(576, eps=1e-05, momentum=0.0, affine=True, 
track_running_stats=True) (activation): ReLU(inplace=True) ) (50): Sequential( (Conv2d): Conv2d(576, 160, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(160, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (51): Sequential( (Conv2d): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(960, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (52): Sequential( (Conv2d): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False) (BatchNorm2d): BatchNorm2d(960, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (53): Sequential( (Conv2d): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(160, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (54): Sequential( (WeightedFeatureFusion): WeightedFeatureFusion() ) (55): Sequential( (Conv2d): Conv2d(160, 960, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(960, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (56): Sequential( (Conv2d): Conv2d(960, 960, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960, bias=False) (BatchNorm2d): BatchNorm2d(960, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (57): Sequential( (Conv2d): Conv2d(960, 160, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(160, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) ) (58): Sequential( (WeightedFeatureFusion): WeightedFeatureFusion() ) (59): MaxPool2d(kernel_size=3, stride=1, padding=1, dilation=1, ceil_mode=False) (60): FeatureConcat() (61): MaxPool2d(kernel_size=5, stride=1, padding=2, dilation=1, ceil_mode=False) (62): FeatureConcat() (63): MaxPool2d(kernel_size=9, stride=1, padding=4, dilation=1, 
ceil_mode=False) (64): FeatureConcat() (65): Sequential( (Conv2d): Conv2d(640, 288, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(288, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (66): Sequential( (Conv2d): Conv2d(288, 288, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=288, bias=False) (BatchNorm2d): BatchNorm2d(288, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (67): Sequential( (Conv2d): Conv2d(288, 96, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(96, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (68): Sequential( (Conv2d): Conv2d(96, 384, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(384, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (69): Sequential( (Conv2d): Conv2d(384, 30, kernel_size=(1, 1), stride=(1, 1)) ) (70): YOLOLayer() (71): FeatureConcat() (72): Upsample(scale_factor=2.0, mode=nearest) (73): FeatureConcat() (74): Sequential( (Conv2d): Conv2d(864, 80, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(80, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (75): Sequential( (Conv2d): Conv2d(80, 288, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(288, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (76): Sequential( (Conv2d): Conv2d(288, 288, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=288, bias=False) (BatchNorm2d): BatchNorm2d(288, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (77): Sequential( (Conv2d): Conv2d(288, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(192, eps=1e-05, momentum=0.0, 
affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (78): Sequential( (Conv2d): Conv2d(192, 288, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(288, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (79): Sequential( (Conv2d): Conv2d(288, 30, kernel_size=(1, 1), stride=(1, 1)) ) (80): YOLOLayer() (81): FeatureConcat() (82): Upsample(scale_factor=2.0, mode=nearest) (83): FeatureConcat() (84): Sequential( (Conv2d): Conv2d(480, 80, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(80, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (85): Sequential( (Conv2d): Conv2d(80, 288, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(288, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (86): Sequential( (Conv2d): Conv2d(288, 288, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=288, bias=False) (BatchNorm2d): BatchNorm2d(288, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (87): Sequential( (Conv2d): Conv2d(288, 192, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(192, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (88): Sequential( (Conv2d): Conv2d(192, 288, kernel_size=(1, 1), stride=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(288, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): ReLU(inplace=True) ) (89): Sequential( (Conv2d): Conv2d(288, 30, kernel_size=(1, 1), stride=(1, 1)) ) (90): YOLOLayer() (91): FeatureConcat() (92): Sequential( (Conv2d): Conv2d(320, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (BatchNorm2d): BatchNorm2d(128, eps=1e-05, momentum=0.0, affine=True, track_running_stats=True) (activation): LeakyReLU(negative_slope=0.1, 
inplace=True) ) (93): Sequential( (Conv2d): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) ) (94): ModuleList( (0): Linear(in_features=128, out_features=524, bias=True) (1): Linear(in_features=128, out_features=148, bias=True) (2): Linear(in_features=128, out_features=185, bias=True) (3): Linear(in_features=128, out_features=421, bias=True) (4): Linear(in_features=128, out_features=56, bias=True) ) ) (id_classifiers): ModuleList( (0): Linear(in_features=128, out_features=524, bias=True) (1): Linear(in_features=128, out_features=148, bias=True) (2): Linear(in_features=128, out_features=185, bias=True) (3): Linear(in_features=128, out_features=421, bias=True) (4): Linear(in_features=128, out_features=56, bias=True) ) ) Cutoff: 0 Feature matcher init done. Image pre-processing method: resize Run seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_16.mp4... 0it [00:00, ?it/s]video (0/163) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_16.mp4: Frame 0 done, time: 28.72992ms Feature map size: 96×56 20it [00:00, 15.71it/s]Frame 20 done, time: 16.57176ms 28it [00:01, 22.00it/s]video (30/163) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_16.mp4: 40it [00:01, 29.35it/s]Frame 40 done, time: 16.51788ms 60it [00:01, 33.18it/s]video (60/163) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_16.mp4: Frame 60 done, time: 16.55507ms 80it [00:02, 33.83it/s]Frame 80 done, time: 16.47282ms 88it [00:02, 33.90it/s]video (90/163) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_16.mp4: 100it [00:03, 34.51it/s]Frame 100 done, time: 16.86764ms 120it [00:03, 35.47it/s]video (120/163) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_16.mp4: Frame 120 done, time: 16.48259ms 140it [00:04, 35.51it/s]Frame 140 done, time: 16.52718ms 148it [00:04, 34.65it/s]video (150/163) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_16.mp4: 160it [00:04, 35.21it/s]Frame 160 done, time: 16.55984ms 163it [00:04, 32.99it/s] Precision: 100.000%, mean cos sim: 0.991, num_TPs: 489 Seq 
/mnt/diskb/even/dataset/MCMOT_Evaluate/val_16.mp4 done.

Image pre-processing method: resize Run seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_19.mp4... 0it [00:00, ?it/s]video (0/319) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_19.mp4: Frame 0 done, time: 16.58630ms Feature map size: 96×56 20it [00:00, 21.65it/s]Frame 20 done, time: 16.53481ms 29it [00:01, 23.56it/s]video (30/319) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_19.mp4: 38it [00:01, 24.26it/s]Frame 40 done, time: 16.50953ms 59it [00:02, 23.82it/s]video (60/319) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_19.mp4: Frame 60 done, time: 16.54673ms 80it [00:03, 25.91it/s]Frame 80 done, time: 16.54029ms 89it [00:03, 24.87it/s]video (90/319) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_19.mp4: 98it [00:04, 23.60it/s]Frame 100 done, time: 17.04502ms 119it [00:04, 25.91it/s]video (120/319) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_19.mp4: Frame 120 done, time: 16.54387ms 140it [00:05, 26.98it/s]Frame 140 done, time: 16.54005ms 149it [00:06, 25.66it/s]video (150/319) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_19.mp4: 158it [00:06, 25.06it/s]Frame 160 done, time: 16.64877ms 179it [00:07, 24.48it/s]video (180/319) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_19.mp4: Frame 180 done, time: 16.53004ms 200it [00:08, 24.07it/s]Frame 200 done, time: 16.56103ms 209it [00:08, 24.17it/s]video (210/319) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_19.mp4: 218it [00:08, 23.32it/s]Frame 220 done, time: 16.57319ms 239it [00:09, 23.90it/s]video (240/319) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_19.mp4: Frame 240 done, time: 16.51788ms 260it [00:10, 25.58it/s]Frame 260 done, time: 16.54077ms 269it [00:10, 27.09it/s]video (270/319) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_19.mp4: 278it [00:11, 27.87it/s]Frame 280 done, time: 16.52479ms 299it [00:11, 27.83it/s]video (300/319) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_19.mp4: Frame 300 done, time: 16.53433ms 319it [00:12, 24.90it/s] Precision: 99.761%, mean cos sim: 0.994, num_TPs: 2122 Seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_19.mp4 
done.

Image pre-processing method: resize Run seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_14.mp4... 0it [00:00, ?it/s]video (0/144) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_14.mp4: Frame 0 done, time: 16.56294ms Feature map size: 96×56 20it [00:00, 18.97it/s]Frame 20 done, time: 16.55149ms 29it [00:01, 24.78it/s]video (30/144) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_14.mp4: 38it [00:01, 27.88it/s]Frame 40 done, time: 16.56008ms 60it [00:02, 29.83it/s]video (60/144) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_14.mp4: Frame 60 done, time: 16.56151ms 79it [00:02, 30.21it/s]Frame 80 done, time: 16.58154ms 87it [00:03, 30.19it/s]video (90/144) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_14.mp4: 99it [00:03, 30.26it/s]Frame 100 done, time: 16.50143ms 119it [00:04, 29.98it/s]video (120/144) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_14.mp4: Frame 120 done, time: 16.53004ms 138it [00:04, 30.03it/s]Frame 140 done, time: 16.55507ms 144it [00:04, 28.81it/s] Precision: 100.000%, mean cos sim: 0.998, num_TPs: 730 Seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_14.mp4 done.

Image pre-processing method: resize Run seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_17.mp4... 0it [00:00, ?it/s]video (0/130) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_17.mp4: Frame 0 done, time: 16.64281ms Feature map size: 96×56 17it [00:00, 21.65it/s]Frame 20 done, time: 16.55459ms 29it [00:00, 29.89it/s]video (30/130) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_17.mp4: 37it [00:01, 31.34it/s]Frame 40 done, time: 16.55722ms 57it [00:01, 36.00it/s]video (60/130) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_17.mp4: Frame 60 done, time: 16.53624ms 77it [00:02, 36.83it/s]Frame 80 done, time: 16.53838ms 89it [00:02, 37.15it/s]video (90/130) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_17.mp4: 97it [00:02, 37.23it/s]Frame 100 done, time: 16.52360ms 117it [00:03, 36.28it/s]video (120/130) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_17.mp4: Frame 120 done, time: 16.54935ms 130it [00:03, 35.53it/s] Precision: 100.000%, mean cos sim: 0.999, num_TPs: 263 Seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_17.mp4 done.

Image pre-processing method: resize Run seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_11.mp4... 0it [00:00, ?it/s]video (0/349) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_11.mp4: Frame 0 done, time: 16.75916ms Feature map size: 96×56 18it [00:00, 15.74it/s]Frame 20 done, time: 16.53981ms 30it [00:01, 24.58it/s]video (30/349) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_11.mp4: 40it [00:01, 27.81it/s]Frame 40 done, time: 16.57081ms 58it [00:02, 29.47it/s]video (60/349) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_11.mp4: Frame 60 done, time: 16.52122ms 78it [00:02, 30.08it/s]Frame 80 done, time: 16.56795ms 89it [00:03, 30.07it/s]video (90/349) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_11.mp4: 97it [00:03, 30.71it/s]Frame 100 done, time: 16.57891ms 117it [00:04, 32.34it/s]video (120/349) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_11.mp4: Frame 120 done, time: 16.52789ms 137it [00:04, 32.64it/s]Frame 140 done, time: 16.54100ms 149it [00:04, 32.54it/s]video (150/349) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_11.mp4: 157it [00:05, 32.45it/s]Frame 160 done, time: 16.55507ms 177it [00:05, 31.53it/s]video (180/349) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_11.mp4: Frame 180 done, time: 16.53838ms 197it [00:06, 30.71it/s]Frame 200 done, time: 16.56365ms 209it [00:06, 30.56it/s]video (210/349) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_11.mp4: 217it [00:07, 31.16it/s]Frame 220 done, time: 16.58106ms 237it [00:07, 32.21it/s]video (240/349) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_11.mp4: Frame 240 done, time: 16.50476ms 257it [00:08, 32.80it/s]Frame 260 done, time: 16.52026ms 269it [00:08, 32.59it/s]video (270/349) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_11.mp4: 277it [00:09, 31.83it/s]Frame 280 done, time: 16.58082ms 297it [00:09, 30.36it/s]video (300/349) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_11.mp4: Frame 300 done, time: 16.58916ms 317it [00:10, 30.24it/s]Frame 320 done, time: 16.55984ms 329it [00:10, 30.18it/s]video (330/349) 
/mnt/diskb/even/dataset/MCMOT_Evaluate/val_11.mp4: 337it [00:10, 30.96it/s]Frame 340 done, time: 16.51311ms 349it [00:11, 30.71it/s] Precision: 100.000%, mean cos sim: 0.989, num_TPs: 1548 Seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_11.mp4 done.

Image pre-processing method: resize Run seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_12.mp4... 0it [00:00, ?it/s]video (0/336) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_12.mp4: Frame 0 done, time: 16.58511ms Feature map size: 96×56 20it [00:00, 20.85it/s]Frame 20 done, time: 16.57391ms 29it [00:01, 25.65it/s]video (30/336) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_12.mp4: 38it [00:01, 27.01it/s]Frame 40 done, time: 16.55054ms 60it [00:02, 28.44it/s]video (60/336) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_12.mp4: Frame 60 done, time: 16.53957ms 80it [00:02, 30.32it/s]Frame 80 done, time: 16.54530ms 88it [00:03, 31.64it/s]video (90/336) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_12.mp4: 100it [00:03, 32.58it/s]Frame 100 done, time: 16.57629ms 120it [00:04, 34.05it/s]video (120/336) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_12.mp4: Frame 120 done, time: 16.59703ms 140it [00:04, 34.00it/s]Frame 140 done, time: 16.58988ms 148it [00:04, 33.36it/s]video (150/336) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_12.mp4: 160it [00:05, 32.86it/s]Frame 160 done, time: 16.74771ms 180it [00:05, 30.82it/s]video (180/336) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_12.mp4: Frame 180 done, time: 16.61539ms 197it [00:06, 28.93it/s]Frame 200 done, time: 16.58320ms 209it [00:06, 30.53it/s]video (210/336) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_12.mp4: 217it [00:07, 31.16it/s]Frame 220 done, time: 16.60681ms 237it [00:07, 32.19it/s]video (240/336) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_12.mp4: Frame 240 done, time: 16.57343ms 257it [00:08, 35.42it/s]Frame 260 done, time: 16.60872ms 269it [00:08, 37.05it/s]video (270/336) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_12.mp4: 277it [00:08, 37.48it/s]Frame 280 done, time: 16.56890ms 297it [00:09, 36.12it/s]video (300/336) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_12.mp4: Frame 300 done, time: 16.56675ms 317it [00:09, 35.10it/s]Frame 320 done, time: 16.60419ms 329it [00:10, 35.30it/s]video (330/336) 
/mnt/diskb/even/dataset/MCMOT_Evaluate/val_12.mp4: 336it [00:10, 32.12it/s] Precision: 100.000%, mean cos sim: 0.978, num_TPs: 1276 Seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_12.mp4 done.

Image pre-processing method: resize Run seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4... 0it [00:00, ?it/s]video (0/382) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4: Frame 0 done, time: 16.64495ms Feature map size: 96×56 20it [00:01, 11.31it/s]Frame 20 done, time: 16.58797ms 30it [00:02, 12.16it/s]video (30/382) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4: 40it [00:03, 12.37it/s]Frame 40 done, time: 18.37564ms 60it [00:05, 12.89it/s]video (60/382) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4: Frame 60 done, time: 18.36801ms 80it [00:06, 12.99it/s]Frame 80 done, time: 18.38374ms 90it [00:07, 12.88it/s]video (90/382) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4: 100it [00:08, 13.71it/s]Frame 100 done, time: 18.40258ms 120it [00:09, 14.34it/s]video (120/382) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4: Frame 120 done, time: 18.40949ms 140it [00:11, 13.49it/s]Frame 140 done, time: 18.38851ms 150it [00:11, 13.93it/s]video (150/382) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4: 160it [00:12, 13.96it/s]Frame 160 done, time: 18.36991ms 180it [00:13, 14.58it/s]video (180/382) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4: Frame 180 done, time: 18.38827ms 200it [00:15, 13.10it/s]Frame 200 done, time: 18.36586ms 210it [00:16, 12.67it/s]video (210/382) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4: 220it [00:16, 11.99it/s]Frame 220 done, time: 18.43500ms 240it [00:18, 12.59it/s]video (240/382) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4: Frame 240 done, time: 18.36419ms 260it [00:20, 12.23it/s]Frame 260 done, time: 18.40162ms 270it [00:21, 12.62it/s]video (270/382) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4: 280it [00:21, 12.65it/s]Frame 280 done, time: 18.38422ms 300it [00:23, 12.52it/s]video (300/382) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4: Frame 300 done, time: 18.36610ms 320it [00:25, 12.54it/s]Frame 320 done, time: 18.38064ms 330it [00:25, 12.12it/s]video (330/382) 
/mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4: 340it [00:26, 12.68it/s]Frame 340 done, time: 18.39352ms 360it [00:28, 11.56it/s]video (360/382) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4: Frame 360 done, time: 18.38040ms 380it [00:29, 12.53it/s]Frame 380 done, time: 18.37778ms 382it [00:30, 12.71it/s] Precision: 100.000%, mean cos sim: 0.992, num_TPs: 4890 Seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_0.mp4 done.

Image pre-processing method: resize Run seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_10.mp4... 0it [00:00, ?it/s]video (0/294) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_10.mp4: Frame 0 done, time: 18.47434ms Feature map size: 96×56 18it [00:00, 19.02it/s]Frame 20 done, time: 16.51955ms 30it [00:01, 25.94it/s]video (30/294) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_10.mp4: 38it [00:01, 29.23it/s]Frame 40 done, time: 16.61563ms 58it [00:01, 32.57it/s]video (60/294) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_10.mp4: Frame 60 done, time: 16.59727ms 78it [00:02, 32.93it/s]Frame 80 done, time: 16.57367ms 90it [00:02, 33.00it/s]video (90/294) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_10.mp4: 98it [00:03, 33.03it/s]Frame 100 done, time: 16.58440ms 118it [00:03, 32.85it/s]video (120/294) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_10.mp4: Frame 120 done, time: 16.60037ms 138it [00:04, 29.89it/s]Frame 140 done, time: 16.59727ms 150it [00:04, 31.47it/s]video (150/294) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_10.mp4: 158it [00:05, 31.87it/s]Frame 160 done, time: 16.56795ms 178it [00:05, 32.42it/s]video (180/294) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_10.mp4: Frame 180 done, time: 16.58726ms 198it [00:06, 32.53it/s]Frame 200 done, time: 16.56103ms 210it [00:06, 32.73it/s]video (210/294) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_10.mp4: 218it [00:06, 32.20it/s]Frame 220 done, time: 16.58678ms 238it [00:07, 33.60it/s]video (240/294) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_10.mp4: Frame 240 done, time: 16.58416ms 258it [00:08, 30.97it/s]Frame 260 done, time: 16.58106ms 270it [00:08, 31.58it/s]video (270/294) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_10.mp4: 278it [00:08, 32.16it/s]Frame 280 done, time: 16.60991ms 294it [00:09, 31.74it/s] Precision: 99.665%, mean cos sim: 0.984, num_TPs: 1205 Seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_10.mp4 done.

Image pre-processing method: resize Run seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_15.mp4... 0it [00:00, ?it/s]video (0/273) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_15.mp4: Frame 0 done, time: 16.63160ms Feature map size: 96×56 19it [00:00, 20.57it/s]Frame 20 done, time: 16.59727ms 28it [00:01, 25.26it/s]video (30/273) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_15.mp4: 38it [00:01, 27.52it/s]Frame 40 done, time: 16.58368ms 59it [00:02, 27.32it/s]video (60/273) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_15.mp4: Frame 60 done, time: 16.61849ms 80it [00:02, 27.31it/s]Frame 80 done, time: 16.62445ms 89it [00:03, 27.30it/s]video (90/273) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_15.mp4: 99it [00:03, 28.89it/s]Frame 100 done, time: 16.63733ms 119it [00:04, 30.01it/s]video (120/273) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_15.mp4: Frame 120 done, time: 16.59703ms 139it [00:04, 30.21it/s]Frame 140 done, time: 16.68978ms 147it [00:05, 29.64it/s]video (150/273) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_15.mp4: 157it [00:05, 29.51it/s]Frame 160 done, time: 16.66093ms 180it [00:06, 28.88it/s]video (180/273) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_15.mp4: Frame 180 done, time: 16.58654ms 198it [00:07, 25.67it/s]Frame 200 done, time: 16.58106ms 210it [00:07, 26.60it/s]video (210/273) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_15.mp4: 219it [00:07, 27.08it/s]Frame 220 done, time: 16.60180ms 240it [00:08, 27.42it/s]video (240/273) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_15.mp4: Frame 240 done, time: 16.58988ms 258it [00:09, 29.53it/s]Frame 260 done, time: 16.58177ms 269it [00:09, 29.89it/s]video (270/273) /mnt/diskb/even/dataset/MCMOT_Evaluate/val_15.mp4: 273it [00:09, 28.13it/s] Precision: 100.000%, mean cos sim: 0.990, num_TPs: 1474 Seq /mnt/diskb/even/dataset/MCMOT_Evaluate/val_15.mp4 done.

defaultdict(<class 'int'>, {0: 0, 5: 0, 10: 0, 15: 0, 20: 0, 25: 0, 30: 0, 35: 0, 40: 0, 45: 0, 50: 0, 55: 0, 60: 0, 65: 1, 70: 1, 75: 11, 80: 29, 85: 97, 90: 630, 95: 13116})
defaultdict(<class 'int'>, {0: 0, 5: 0, 10: 0, 15: 0, 20: 0, 25: 0, 30: 0, 35: 0, 40: 0, 45: 0, 50: 0, 55: 0, 60: 0, 65: 0, 70: 2, 75: 2, 80: 0, 85: 0, 90: 2, 95: 3})

Bin           Wrong   Correct     Ratio
[  0,   5]    0.000     0.000     0.000
[  5,  10]    0.000     0.000     0.000
[ 10,  15]    0.000     0.000     0.005
[ 15,  20]    0.000     0.000     0.015
[ 20,  25]    0.000     0.000     0.055
[ 25,  30]    0.000     0.000     0.123
[ 30,  35]    0.000     0.000     0.689
[ 35,  40]    0.000     0.000     1.281
[ 40,  45]    0.000     0.000     2.186
[ 45,  50]    0.000     0.000     3.994
[ 50,  55]    0.000     0.000     7.053
[ 55,  60]    0.000     0.000     9.223
[ 60,  65]    0.000     0.000    10.898
[ 65,  70]    0.000     0.007    10.639
[ 70,  75]    0.014     0.007    10.749
[ 75,  80]    0.014     0.079    10.558
[ 80,  85]    0.000     0.209    10.573
[ 85,  90]    0.000     0.698     6.510
[ 90,  95]    0.014     4.534     2.992
[ 95, 100]    0.022    94.400    12.457
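For readers who want to reproduce this kind of histogram on their own data, here is a minimal sketch of how cosine similarities might be bucketed into 5% bins and turned into percentages. It is not taken from evaluate_feature_matching.py; the function names `bin_similarities` and `as_percent_of` are hypothetical, and the assumption that the percentages are counts divided by the total number of tested matches is inferred only from the figures in the log (for example, 13116 of 13894 matches is about 94.400%).

```python
from collections import defaultdict


def bin_similarities(sims, bin_width=5):
    """Count cosine similarities (expected in [0, 1]) per percent bin.

    Keys are the lower edge of each bin (0, 5, ..., 95), mirroring the
    layout of the defaultdicts printed in the log. Hypothetical helper.
    """
    hist = defaultdict(int)
    for start in range(0, 100, bin_width):
        hist[start] = 0  # create every bin so empty ones still show up
    for s in sims:
        pct = max(0.0, min(s, 1.0)) * 100.0
        # A similarity of exactly 1.0 is folded into the last bin.
        start = min(int(pct // bin_width) * bin_width, 100 - bin_width)
        hist[start] += 1
    return hist


def as_percent_of(hist, total):
    """Express each bin count as a percentage of `total` (assumed denominator)."""
    return {start: (100.0 * n / total if total else 0.0)
            for start, n in hist.items()}


if __name__ == "__main__":
    correct_sims = [0.97, 0.99, 0.92, 0.72]  # made-up illustrative values
    hist = bin_similarities(correct_sims)
    for start, pct in sorted(as_percent_of(hist, len(correct_sims)).items()):
        print(f"Correct [{start:3d}, {start + 5:3d}]: {pct:.3f}")
```

Keeping separate histograms for correct and wrong matches, as the log does, makes it easy to see whether wrong matches concentrate at lower similarities.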

Total 13997 true positives detected.
Total 13894 matches tested.
Num total match: 13894
Correct matched number: 13885
Wrong matched number: 9
Mean precision: 99.936%
Average precision: 99.935%
Min same ID similarity: 0.678
Max diff ID similarity: 0.976
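As a rough illustration of what these summary numbers measure, the sketch below matches each embedding in one frame to its most cosine-similar embedding in the previous frame and counts a match as correct when the track IDs agree. This is an assumption about the general technique, not the logic of evaluate_feature_matching.py; `cosine` and `evaluate_matching` are hypothetical names, and the embeddings are toy 2-D vectors rather than the 128-D vectors reported in the log.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def evaluate_matching(prev, curr):
    """Nearest-neighbour matching between two frames (hypothetical sketch).

    prev, curr: lists of (track_id, embedding) pairs.
    Returns match counts, precision in percent, the lowest similarity seen
    between embeddings of the same ID, and the highest similarity seen
    between embeddings of different IDs.
    """
    correct = wrong = 0
    min_same, max_diff = 1.0, -1.0
    for cur_id, cur_vec in curr:
        sims = [(cosine(cur_vec, p_vec), p_id) for p_id, p_vec in prev]
        if not sims:
            continue
        _, best_id = max(sims, key=lambda pair: pair[0])
        if best_id == cur_id:
            correct += 1
        else:
            wrong += 1
        for sim, p_id in sims:
            if p_id == cur_id:
                min_same = min(min_same, sim)
            else:
                max_diff = max(max_diff, sim)
    total = correct + wrong
    return {
        "correct": correct,
        "wrong": wrong,
        "precision": 100.0 * correct / total if total else 0.0,
        "min_same": min_same,
        "max_diff": max_diff,
    }


if __name__ == "__main__":
    prev = [(1, [1.0, 0.0]), (2, [0.0, 1.0])]  # toy embeddings
    curr = [(1, [0.9, 0.1]), (2, [0.2, 0.8])]
    print(evaluate_matching(prev, curr))
```

The two precision figures in the log appear consistent with two ways of aggregating: 13885 / 13894 is about 99.935% (all matches pooled), while the mean of the nine per-sequence precisions printed above is about 99.936%. A "Max diff ID similarity" above the "Min same ID similarity" means no single similarity threshold separates the two groups perfectly, which is why nearest-neighbour matching rather than thresholding is assumed here.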

Process finished with exit code 0