open-mmlab / mmaction2

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
https://mmaction2.readthedocs.io
Apache License 2.0

r2plus1d #1511

Closed WEIZHIHONG720 closed 2 years ago

WEIZHIHONG720 commented 2 years ago

Hi! In the r2plus1d_r34_32x2x1_180e_kinetics400_rgb config, if I want to change the number of dataset classes and load the pre-trained model, how do I do it?

kennymckormick commented 2 years ago

You can just modify num_classes in the cls_head definition.
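A minimal override sketch, assuming the cls_head layout from the base r2plus1d config; the checkpoint path is only a placeholder, not an official URL:

```python
# Minimal sketch: change the class count and load pre-trained weights.
# '/path/to/pretrained.pth' is a placeholder, not an official checkpoint URL.
_base_ = [
    '../../_base_/models/r2plus1d_r34.py',
    '../../_base_/default_runtime.py'
]

model = dict(
    cls_head=dict(num_classes=10))  # set to your dataset's number of classes

# Weights whose shapes no longer match (the final fc layer) are skipped
# with a warning when the checkpoint is loaded.
load_from = '/path/to/pretrained.pth'
```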

WEIZHIHONG720 commented 2 years ago

emm... @kennymckormick (;′⌒) Thank you! But I don't know why I hit this problem: AssertionError: Default process group is not initialized. I only changed dataset_type = 'VideoDataset'.

Here is my config:

_base_ = [
    '../../_base_/models/r2plus1d_r34.py',
    '../../_base_/default_runtime.py'
]

# dataset settings
dataset_type = 'VideoDataset'
data_root = '/home/data1/wzh//labeled-train/Train/'
data_root_val = '/home/data1/wzh//labeled-train/Train/'
ann_file_train = '/home/wzh21/mmaction2-master/tools/data/kinetics/fold/train_fold_0.txt'
ann_file_val = '/home/wzh21/mmaction2-master/tools/data/kinetics/fold/val_fold_0.txt'
ann_file_test = '/home/wzh21/mmaction2-master/tools/data/kinetics/fold/val_fold_0.txt'

img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_bgr=False)

train_pipeline = [
    dict(type='DecordInit'),
    dict(type='SampleFrames', clip_len=8, frame_interval=8, num_clips=1),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='RandomResizedCrop'),
    dict(type='Resize', scale=(224, 224), keep_ratio=False),
    dict(type='Flip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs', 'label'])
]
val_pipeline = [
    dict(type='DecordInit'),
    dict(
        type='SampleFrames',
        clip_len=8,
        frame_interval=8,
        num_clips=1,
        test_mode=True),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='CenterCrop', crop_size=224),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs'])
]
test_pipeline = [
    dict(type='DecordInit'),
    dict(
        type='SampleFrames',
        clip_len=8,
        frame_interval=8,
        num_clips=10,
        test_mode=True),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 256)),
    dict(type='ThreeCrop', crop_size=256),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='FormatShape', input_format='NCTHW'),
    dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs'])
]
data = dict(
    videos_per_gpu=8,
    workers_per_gpu=2,
    test_dataloader=dict(videos_per_gpu=1),
    train=dict(
        type=dataset_type,
        ann_file=ann_file_train,
        data_prefix=data_root,
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=ann_file_val,
        data_prefix=data_root_val,
        pipeline=val_pipeline,
        test_mode=True),
    test=dict(
        type=dataset_type,
        ann_file=ann_file_val,
        data_prefix=data_root_val,
        pipeline=test_pipeline,
        test_mode=True))
evaluation = dict(
    interval=5, metrics=['top_k_accuracy', 'mean_class_accuracy'])

# optimizer
optimizer = dict(
    type='SGD', lr=0.1, momentum=0.9,
    weight_decay=0.0001)  # this lr is used for 8 gpus
optimizer_config = dict(grad_clip=dict(max_norm=40, norm_type=2))

# learning policy
lr_config = dict(policy='CosineAnnealing', min_lr=0)
total_epochs = 180

# runtime settings
checkpoint_config = dict(interval=5)
work_dir = './work_dirs/r2plus1d_r34_8x8x1_180e_kinetics400_rgb/'
find_unused_parameters = False
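A quick sanity check for a config like this, assuming the mmaction2 0.x API (mmcv Config, build_dataset, build_model); the config path below is a placeholder:

```python
# Sketch: parse the config and build the model/dataset before launching training.
# 'my_r2plus1d_videodataset.py' is a placeholder path; point it at the real file.
from mmcv import Config

from mmaction.datasets import build_dataset
from mmaction.models import build_model

cfg = Config.fromfile('configs/recognition/r2plus1d/my_r2plus1d_videodataset.py')

# Building both objects catches config typos (e.g. a missing ** in
# dict(type='Normalize', **img_norm_cfg)) before a full training run.
model = build_model(
    cfg.model, train_cfg=cfg.get('train_cfg'), test_cfg=cfg.get('test_cfg'))
dataset = build_dataset(cfg.data.train)
print(type(model).__name__, len(dataset))
```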

kennymckormick commented 2 years ago

This error usually means a layer that needs a distributed process group (SyncBN in the r2plus1d backbone) is running without one. Use distributed training: ./tools/dist_train.sh {config} {num_gpus} {other_args}
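If the distributed launcher is not an option, a single-GPU workaround sketch (an assumption, not something stated above, and it assumes the base r2plus1d config sets the backbone norm_cfg to SyncBN) is to override SyncBN with plain BN3d so no process group is needed:

```python
# Sketch of a single-GPU workaround: replace SyncBN with regular BN3d.
# Assumes the base r2plus1d config uses norm_cfg=dict(type='SyncBN', ...);
# adjust the override if the base config differs.
_base_ = [
    '../../_base_/models/r2plus1d_r34.py',
    '../../_base_/default_runtime.py'
]

model = dict(
    backbone=dict(
        norm_cfg=dict(type='BN3d', requires_grad=True)))
```

With that override, the non-distributed entry point (python tools/train.py {config}) should run, at the cost of per-GPU batch statistics instead of synchronized ones.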