open-mmlab / mmtracking

OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.
https://mmtracking.readthedocs.io/en/latest/
Apache License 2.0
3.52k stars 588 forks

In SOT testing, how can I obtain evaluation metrics like those reported by pysot and pytracking? #855

Open MonsterZUO opened 1 year ago

MonsterZUO commented 1 year ago

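I tested the STARK-ST model on GOT10K and am preparing to test it on the LaSOT dataset, but the saved PKL file seems to contain only the predicted bbox data. How can I obtain the per-dataset evaluation metrics (mAO, Norm, Prec, etc.) that codebases such as pysot and pytracking provide? I don't quite understand this and would appreciate some help.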
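For reference, the overlap-based numbers asked about here can be computed directly from per-frame boxes. The following is a minimal sketch under stated assumptions, not the implementation used by mmtracking, pysot, or pytracking: it assumes boxes are in (x1, y1, x2, y2) corner format and that ground truth exists for every frame of the sequence. The helper names `iou`, `average_overlap`, and `success_rate` are hypothetical.

```python
from typing import Sequence

# Assumed box layout: (x1, y1, x2, y2) corners, in pixels.
Box = Sequence[float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_overlap(preds: Sequence[Box], gts: Sequence[Box]) -> float:
    """Mean IoU over the frames of one sequence (AO-style score)."""
    if len(preds) != len(gts) or not preds:
        raise ValueError("need equally many, non-empty preds and gts")
    return sum(iou(p, g) for p, g in zip(preds, gts)) / len(preds)


def success_rate(preds: Sequence[Box], gts: Sequence[Box],
                 threshold: float) -> float:
    """Fraction of frames whose IoU exceeds the threshold (SR-style score)."""
    if len(preds) != len(gts) or not preds:
        raise ValueError("need equally many, non-empty preds and gts")
    hits = sum(1 for p, g in zip(preds, gts) if iou(p, g) > threshold)
    return hits / len(preds)


if __name__ == "__main__":
    preds = [(0, 0, 10, 10), (0, 0, 10, 10)]
    gts = [(0, 0, 10, 10), (5, 0, 15, 10)]
    print(average_overlap(preds, gts), success_rate(preds, gts, 0.5))
```

Benchmark toolkits differ in details such as whether the first frame is skipped and how scores are averaged across sequences, so numbers from a sketch like this are not directly comparable with official leaderboard values.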

mm-assistant[bot] commented 1 year ago

We recommend using English or English & Chinese for issues so that we could have broader discussion.

MonsterZUO commented 1 year ago

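Additionally:

There are 3 JSON files in data/got10k/annotations:
got10k_train.json: JSON file containing the annotation information for the GOT10k training set.
got10k_test.json: JSON file containing the annotation information for the GOT10k test set.
got10k_val.json: JSON file containing the annotation information for the GOT10k validation set.

How do I obtain these JSON files for the GOT10K, LaSOT, and TrackingNet datasets? I found that the py files only generate txt files.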

MonsterZUO commented 1 year ago

When I try to test STARK on GOT10k:

./configs/sot/stark/stark_st2_r50_50e_got10k.py --checkpoint work_dirs/xxx/latest.pth --out results/stark_st2_got_ori/result_4.pkl --eval track

Error:

/root/miniconda3/envs/open-mmlab/bin/python /root/autodl-tmp/mmtracking/tools/test.py ./configs/sot/stark/stark_st2_r50_50e_got10k.py --checkpoint work_dirs/xxx/latest.pth --out results/stark_st2_got_ori/result_4.pkl --eval track
/root/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/__init__.py:21: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
  'On January 1, 2023, MMCV will release v2.0.0, in which it will remove '
Error importing BURST due to missing underlying dependency: No module named 'tabulate'
Loading GOT10k dataset...
GOT10k dataset loaded! (0.00 s)
/root/autodl-tmp/mmtracking/mmtrack/core/utils/misc.py:27: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  f'Setting OMP_NUM_THREADS environment variable for each process '
/root/autodl-tmp/mmtracking/mmtrack/core/utils/misc.py:37: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  f'Setting MKL_NUM_THREADS environment variable for each process '
/root/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/cnn/bricks/conv_module.py:154: UserWarning: Unnecessary conv bias before batch/instance norm
  'Unnecessary conv bias before batch/instance norm')
2023-03-05 22:20:41,517 - mmcv - INFO - initialize ResNet with init_cfg {'type': 'Pretrained', 'checkpoint': 'torchvision://resnet50'}
2023-03-05 22:20:41,517 - mmcv - INFO - load model from: torchvision://resnet50
2023-03-05 22:20:41,518 - mmcv - INFO - load checkpoint from torchvision path: torchvision://resnet50
2023-03-05 22:20:41,937 - mmcv - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: layer4.0.conv1.weight, layer4.0.bn1.running_mean, layer4.0.bn1.running_var, layer4.0.bn1.weight, layer4.0.bn1.bias, layer4.0.conv2.weight, layer4.0.bn2.running_mean, layer4.0.bn2.running_var, layer4.0.bn2.weight, layer4.0.bn2.bias, layer4.0.conv3.weight, layer4.0.bn3.running_mean, layer4.0.bn3.running_var, layer4.0.bn3.weight, layer4.0.bn3.bias, layer4.0.downsample.0.weight, layer4.0.downsample.1.running_mean, layer4.0.downsample.1.running_var, layer4.0.downsample.1.weight, layer4.0.downsample.1.bias, layer4.1.conv1.weight, layer4.1.bn1.running_mean, layer4.1.bn1.running_var, layer4.1.bn1.weight, layer4.1.bn1.bias, layer4.1.conv2.weight, layer4.1.bn2.running_mean, layer4.1.bn2.running_var, layer4.1.bn2.weight, layer4.1.bn2.bias, layer4.1.conv3.weight, layer4.1.bn3.running_mean, layer4.1.bn3.running_var, layer4.1.bn3.weight, layer4.1.bn3.bias, layer4.2.conv1.weight, layer4.2.bn1.running_mean, layer4.2.bn1.running_var, layer4.2.bn1.weight, layer4.2.bn1.bias, layer4.2.conv2.weight, layer4.2.bn2.running_mean, layer4.2.bn2.running_var, layer4.2.bn2.weight, layer4.2.bn2.bias, layer4.2.conv3.weight, layer4.2.bn3.running_mean, layer4.2.bn3.running_var, layer4.2.bn3.weight, layer4.2.bn3.bias, fc.weight, fc.bias

load checkpoint from local path: work_dirs/xxx/latest.pth
[>>>>>>>>>>>>>>>>>>>>>>>] 22834/22834, 21.0 task/s, elapsed: 1089s, ETA: 0s
writing results to results/stark_st2_got_ori/result_4.pkl
Evaluate OPE Benchmark...
Traceback (most recent call last):
  File "/root/autodl-tmp/mmtracking/tools/test.py", line 226, in <module>
    main()
  File "/root/autodl-tmp/mmtracking/tools/test.py", line 216, in main
    metric = dataset.evaluate(outputs, **eval_kwargs)
  File "/root/autodl-tmp/mmtracking/mmtrack/datasets/base_sot_dataset.py", line 325, in evaluate
    visible_infos=visible_infos)
  File "/root/autodl-tmp/mmtracking/mmtrack/core/evaluation/eval_sot_ope.py", line 90, in eval_sot_ope
    assert len(pred_bboxes) == len(single_video_gt_bboxes)
AssertionError
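The assertion at the end of the traceback fails when a video has a different number of predicted boxes than ground-truth boxes. A minimal diagnostic sketch follows for locating the offending videos before calling the evaluator. `split_by_lengths` and `find_length_mismatches` are hypothetical helpers, not part of mmtracking, and loading the per-video ground truth is left out because it depends on the dataset layout.

```python
from typing import List, Sequence, Tuple


def split_by_lengths(flat: Sequence, lengths: Sequence[int]) -> List[list]:
    """Split a flat per-frame list into per-video chunks of the given sizes."""
    if sum(lengths) != len(flat):
        raise ValueError(
            f"lengths sum to {sum(lengths)} but got {len(flat)} items")
    chunks, start = [], 0
    for n in lengths:
        chunks.append(list(flat[start:start + n]))
        start += n
    return chunks


def find_length_mismatches(
        pred_per_video: Sequence[Sequence],
        gt_per_video: Sequence[Sequence]) -> List[Tuple[int, int, int]]:
    """Return (video_index, num_pred, num_gt) for every video that differs."""
    if len(pred_per_video) != len(gt_per_video):
        raise ValueError("prediction and ground-truth video counts differ")
    return [(i, len(p), len(g))
            for i, (p, g) in enumerate(zip(pred_per_video, gt_per_video))
            if len(p) != len(g)]


if __name__ == "__main__":
    # Toy data: two videos with 3 and 2 frames of predictions.
    preds = split_by_lengths(list(range(5)), [3, 2])
    # The second video has ground truth for its first frame only.
    gts = [[0, 1, 2], [0]]
    print(find_length_mismatches(preds, gts))
```

One possible cause, stated here as an assumption rather than a confirmed diagnosis, is that the GOT-10k test split publishes ground truth only for the first frame of each sequence, in which case every video would show up in the mismatch list.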

config print

Config:
cudnn_benchmark = True
deterministic = True
seed = 1
model = dict(
    type='Stark',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=3,
        strides=(1, 2, 2),
        dilations=[1, 1, 1],
        out_indices=[2],
        frozen_stages=1,
        norm_eval=True,
        norm_cfg=dict(type='BN', requires_grad=False),
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    neck=dict(
        type='ChannelMapper',
        in_channels=[1024],
        out_channels=256,
        kernel_size=1,
        act_cfg=None),
    head=dict(
        type='StarkHead',
        num_querys=1,
        transformer=dict(
            type='StarkTransformer',
            encoder=dict(
                type='DetrTransformerEncoder',
                num_layers=6,
                transformerlayers=dict(
                    type='BaseTransformerLayer',
                    attn_cfgs=[
                        dict(
                            type='MultiheadAttention',
                            embed_dims=256,
                            num_heads=8,
                            attn_drop=0.1,
                            dropout_layer=dict(type='Dropout', drop_prob=0.1))
                    ],
                    ffn_cfgs=dict(
                        feedforward_channels=2048,
                        embed_dims=256,
                        ffn_drop=0.1),
                    operation_order=('self_attn', 'norm', 'ffn', 'norm'))),
            decoder=dict(
                type='DetrTransformerDecoder',
                return_intermediate=False,
                num_layers=6,
                transformerlayers=dict(
                    type='BaseTransformerLayer',
                    attn_cfgs=dict(
                        type='MultiheadAttention',
                        embed_dims=256,
                        num_heads=8,
                        attn_drop=0.1,
                        dropout_layer=dict(type='Dropout', drop_prob=0.1)),
                    ffn_cfgs=dict(
                        feedforward_channels=2048,
                        embed_dims=256,
                        ffn_drop=0.1),
                    operation_order=('self_attn', 'norm', 'cross_attn', 'norm',
                                     'ffn', 'norm')))),
        positional_encoding=dict(
            type='SinePositionalEncoding', num_feats=128, normalize=True),
        bbox_head=dict(
            type='CornerPredictorHead',
            inplanes=256,
            channel=256,
            feat_size=20,
            stride=16),
        loss_bbox=dict(type='L1Loss', loss_weight=5.0),
        loss_iou=dict(type='GIoULoss', loss_weight=2.0),
        cls_head=dict(
            type='ScoreHead',
            input_dim=256,
            hidden_dim=256,
            output_dim=1,
            num_layers=3,
            use_bn=False),
        frozen_modules=['transformer', 'bbox_head', 'query_embedding'],
        loss_cls=dict(type='CrossEntropyLoss', use_sigmoid=True)),
    test_cfg=dict(
        search_factor=5.0,
        search_size=320,
        template_factor=2.0,
        template_size=128,
        update_intervals=[200]),
    frozen_modules=['backbone', 'neck'])
data_root = 'data/'
train_pipeline = [
    dict(
        type='TridentSampling',
        num_search_frames=1,
        num_template_frames=2,
        max_frame_range=[200],
        cls_pos_prob=0.5,
        train_cls_head=True),
    dict(type='LoadMultiImagesFromFile', to_float32=True),
    dict(type='SeqLoadAnnotations', with_bbox=True, with_label=True),
    dict(type='SeqGrayAug', prob=0.05),
    dict(
        type='SeqRandomFlip',
        share_params=True,
        flip_ratio=0.5,
        direction='horizontal'),
    dict(
        type='SeqBboxJitter',
        center_jitter_factor=[0, 0, 4.5],
        scale_jitter_factor=[0, 0, 0.5],
        crop_size_factor=[2, 2, 5]),
    dict(
        type='SeqCropLikeStark',
        crop_size_factor=[2, 2, 5],
        output_size=[128, 128, 320]),
    dict(type='SeqBrightnessAug', jitter_range=0.2),
    dict(
        type='SeqRandomFlip',
        share_params=False,
        flip_ratio=0.5,
        direction='horizontal'),
    dict(
        type='SeqNormalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='CheckPadMaskValidity', stride=16),
    dict(
        type='VideoCollect',
        keys=['img', 'gt_bboxes', 'gt_labels', 'padding_mask'],
        meta_keys='valid'),
    dict(type='ConcatSameTypeFrames', num_key_frames=2),
    dict(type='SeqDefaultFormatBundle', ref_prefix='search')
]
img_norm_cfg = dict(mean=[0, 0, 0], std=[1, 1, 1], to_rgb=True)
test_pipeline = [
    dict(type='LoadImageFromFile', to_float32=True),
    dict(type='LoadAnnotations', with_bbox=True, with_label=False),
    dict(
        type='MultiScaleFlipAug',
        scale_factor=1,
        flip=False,
        transforms=[
            dict(type='Normalize', mean=[0, 0, 0], std=[1, 1, 1], to_rgb=True),
            dict(type='VideoCollect', keys=['img', 'gt_bboxes']),
            dict(type='ImageToTensor', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=256,
    workers_per_gpu=9,
    persistent_workers=True,
    samples_per_epoch=60000,
    train=dict(
        type='RandomSampleConcatDataset',
        dataset_sampling_weights=[1],
        dataset_cfgs=[
            dict(
                type='GOT10kDataset',
                ann_file='data/got10k/annotations/got10k_train_infos.txt',
                img_prefix='data/got10k',
                pipeline=[
                    dict(
                        type='TridentSampling',
                        num_search_frames=1,
                        num_template_frames=2,
                        max_frame_range=[200],
                        cls_pos_prob=0.5,
                        train_cls_head=True),
                    dict(type='LoadMultiImagesFromFile', to_float32=True),
                    dict(
                        type='SeqLoadAnnotations',
                        with_bbox=True,
                        with_label=True),
                    dict(type='SeqGrayAug', prob=0.05),
                    dict(
                        type='SeqRandomFlip',
                        share_params=True,
                        flip_ratio=0.5,
                        direction='horizontal'),
                    dict(
                        type='SeqBboxJitter',
                        center_jitter_factor=[0, 0, 4.5],
                        scale_jitter_factor=[0, 0, 0.5],
                        crop_size_factor=[2, 2, 5]),
                    dict(
                        type='SeqCropLikeStark',
                        crop_size_factor=[2, 2, 5],
                        output_size=[128, 128, 320]),
                    dict(type='SeqBrightnessAug', jitter_range=0.2),
                    dict(
                        type='SeqRandomFlip',
                        share_params=False,
                        flip_ratio=0.5,
                        direction='horizontal'),
                    dict(
                        type='SeqNormalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='CheckPadMaskValidity', stride=16),
                    dict(
                        type='VideoCollect',
                        keys=['img', 'gt_bboxes', 'gt_labels', 'padding_mask'],
                        meta_keys='valid'),
                    dict(type='ConcatSameTypeFrames', num_key_frames=2),
                    dict(type='SeqDefaultFormatBundle', ref_prefix='search')
                ],
                split='train',
                test_mode=False)
        ]),
    val=dict(
        type='GOT10kDataset',
        ann_file='data/got10k/annotations/got10k_test_infos.txt',
        img_prefix='data/got10k',
        pipeline=[
            dict(type='LoadImageFromFile', to_float32=True),
            dict(type='LoadAnnotations', with_bbox=True, with_label=False),
            dict(
                type='MultiScaleFlipAug',
                scale_factor=1,
                flip=False,
                transforms=[
                    dict(
                        type='Normalize',
                        mean=[0, 0, 0],
                        std=[1, 1, 1],
                        to_rgb=True),
                    dict(type='VideoCollect', keys=['img', 'gt_bboxes']),
                    dict(type='ImageToTensor', keys=['img'])
                ])
        ],
        split='test',
        test_mode=True),
    test=dict(
        type='GOT10kDataset',
        ann_file='data/got10k/annotations/got10k_test_infos.txt',
        img_prefix='data/got10k',
        pipeline=[
            dict(type='LoadImageFromFile', to_float32=True),
            dict(type='LoadAnnotations', with_bbox=True, with_label=False),
            dict(
                type='MultiScaleFlipAug',
                scale_factor=1,
                flip=False,
                transforms=[
                    dict(
                        type='Normalize',
                        mean=[0, 0, 0],
                        std=[1, 1, 1],
                        to_rgb=True),
                    dict(type='VideoCollect', keys=['img', 'gt_bboxes']),
                    dict(type='ImageToTensor', keys=['img'])
                ])
        ],
        split='test',
        test_mode=True))
optimizer = dict(
    type='AdamW',
    lr=0.0001,
    weight_decay=0.0001,
    paramwise_cfg=dict(
        custom_keys=dict(backbone=dict(lr_mult=0.1, decay_mult=1.0))))
optimizer_config = dict(grad_clip=dict(max_norm=0.1, norm_type=2))
lr_config = dict(policy='step', step=[40])
checkpoint_config = dict(interval=10)
evaluation = dict(
    metric=['track'],
    interval=100,
    start=51,
    rule='greater',
    save_best='success')
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        dict(
            type='WandbLoggerHook',
            by_epoch=False,
            init_kwargs=dict(entity='huangyz-cv', project='stark'))
    ])
total_epochs = 50
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/stark_st2_origin'
load_from = 'logs/stark_st1_got10k_online/epoch_500.pth'
resume_from = None
workflow = [('train', 1)]

Dataset structure:

[image]

zhangrujia commented 1 year ago

Hi, have you solved the problem? I hit the same error when I tried to test STARK on TrackingNet.

MonsterZUO commented 1 year ago

> Hi, have you solved the problem? I hit the same error when I tried to test STARK on TrackingNet.

Not yet. I have switched to pysot for my future work.