open-mmlab / mmaction2

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
https://mmaction2.readthedocs.io
Apache License 2.0

How to use custom dataset for action recognition task? #2659

Open luciferasura opened 1 year ago

luciferasura commented 1 year ago

Branch

main branch (1.x version, such as v1.0.0, or dev-1.x branch)

Environment

System environment:
    sys.platform: linux
    Python: 3.8.17 (default, Jul 5 2023, 21:04:15) [GCC 11.2.0]
    CUDA available: True
    numpy_random_seed: 738065409
    GPU 0: NVIDIA GeForce RTX 3090
    CUDA_HOME: /usr/local/cuda
    NVCC: Cuda compilation tools, release 11.0, V11.0.221
    GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
    PyTorch: 1.7.0
    PyTorch compiling details: PyTorch built with:

Runtime environment:
    cudnn_benchmark: False
    mp_cfg: {'mp_start_method': 'fork', 'opencv_num_threads': 0}
    dist_cfg: {'backend': 'nccl'}
    seed: 738065409
    diff_rank_seed: False
    deterministic: False
    Distributed launcher: none
    Distributed training: False
    GPU number: 1

Describe the bug

Traceback (most recent call last):
  File "tools/train.py", line 135, in <module>
    main()
  File "tools/train.py", line 131, in main
    runner.train()
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1703, in train
    self._train_loop = self.build_train_loop(
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1495, in build_train_loop
    loop = LOOPS.build(
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args)  # type: ignore
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 44, in __init__
    super().__init__(runner, dataloader)
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/runner/base_loop.py", line 26, in __init__
    self.dataloader = runner.build_dataloader(
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1353, in build_dataloader
    dataset = DATASETS.build(dataset_cfg)
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args)  # type: ignore
  File "/mmaction2/mmaction/datasets/video_dataset.py", line 67, in __init__
    super().__init__(
  File "/mmaction2/mmaction/datasets/base.py", line 48, in __init__
    super().__init__(
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 245, in __init__
    self.full_init()
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 296, in full_init
    self.data_list = self.load_data_list()
  File "/mmaction2/mmaction/datasets/video_dataset.py", line 93, in load_data_list
    filename, label = line_split
ValueError: too many values to unpack (expected 2)

Reproduces the problem - code sample

The following is the complete config code:

ann_file_train = '/mmaction2/classroomactionvideo/train.txt'
ann_file_val = '/mmaction2/classroomactionvideo/val.txt'
auto_scale_lr = dict(base_batch_size=256, enable=False)
data_root = '/mmaction2/classroomactionvideo/train/'
data_root_val = '/mmaction2/classroomactionvideo/val/'
dataset_type = 'VideoDataset'
default_hooks = dict(
    checkpoint=dict(interval=1, save_best='auto', type='CheckpointHook'),
    logger=dict(ignore_last=False, interval=20, type='LoggerHook'),
    param_scheduler=dict(type='ParamSchedulerHook'),
    runtime_info=dict(type='RuntimeInfoHook'),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    sync_buffers=dict(type='SyncBuffersHook'),
    timer=dict(type='IterTimerHook'))
default_scope = 'mmaction'
env_cfg = dict(
    cudnn_benchmark=False,
    dist_cfg=dict(backend='nccl'),
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0))
file_client_args = dict(io_backend='disk')
launcher = 'none'
load_from = None
log_level = 'INFO'
log_processor = dict(by_epoch=True, type='LogProcessor', window_size=20)
model = dict(
    backbone=dict(
        depth=50,
        norm_eval=False,
        pretrained='https://download.pytorch.org/models/resnet50-11ad3fa6.pth',
        type='ResNet'),
    cls_head=dict(
        average_clips='prob',
        consensus=dict(dim=1, type='AvgConsensus'),
        dropout_ratio=0.4,
        in_channels=2048,
        init_std=0.01,
        num_classes=7,
        spatial_type='avg',
        type='TSNHead'),
    data_preprocessor=dict(
        format_shape='NCHW',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        type='ActionDataPreprocessor'),
    test_cfg=None,
    train_cfg=None,
    type='Recognizer2D')
optim_wrapper = dict(
    clip_grad=dict(max_norm=40, norm_type=2),
    optimizer=dict(lr=0.005, momentum=0.9, type='SGD', weight_decay=0.0001))
param_scheduler = [
    dict(
        begin=0,
        by_epoch=True,
        end=50,
        gamma=0.1,
        milestones=[20, 40],
        type='MultiStepLR'),
]
randomness = dict(deterministic=False, diff_rank_seed=False, seed=None)
resume = False
test_cfg = dict(type='TestLoop')
test_dataloader = dict(
    batch_size=1,
    dataset=dict(
        ann_file='/mmaction2/classroomactionvideo/val.txt',
        data_prefix=dict(video='/mmaction2/classroomactionvideo/val/'),
        pipeline=[
            dict(io_backend='disk', type='DecordInit'),
            dict(
                clip_len=1,
                frame_interval=1,
                num_clips=25,
                test_mode=True,
                type='SampleFrames'),
            dict(type='DecordDecode'),
            dict(scale=(-1, 256), type='Resize'),
            dict(crop_size=224, type='TenCrop'),
            dict(input_format='NCHW', type='FormatShape'),
            dict(type='PackActionInputs'),
        ],
        test_mode=True,
        type='VideoDataset'),
    num_workers=2,
    persistent_workers=True,
    sampler=dict(shuffle=False, type='DefaultSampler'))
test_evaluator = dict(type='AccMetric')
test_pipeline = [
    dict(io_backend='disk', type='DecordInit'),
    dict(
        clip_len=1,
        frame_interval=1,
        num_clips=25,
        test_mode=True,
        type='SampleFrames'),
    dict(type='DecordDecode'),
    dict(scale=(-1, 256), type='Resize'),
    dict(crop_size=224, type='TenCrop'),
    dict(input_format='NCHW', type='FormatShape'),
    dict(type='PackActionInputs'),
]
train_cfg = dict(
    max_epochs=50, type='EpochBasedTrainLoop', val_begin=1, val_interval=1)
train_dataloader = dict(
    batch_size=32,
    dataset=dict(
        ann_file='/mmaction2/classroomactionvideo/train.txt',
        data_prefix=dict(video='/mmaction2/classroomactionvideo/train/'),
        pipeline=[
            dict(io_backend='disk', type='DecordInit'),
            dict(
                clip_len=1, frame_interval=1, num_clips=3,
                type='SampleFrames'),
            dict(type='DecordDecode'),
            dict(scale=(-1, 256), type='Resize'),
            dict(
                input_size=224,
                max_wh_scale_gap=1,
                random_crop=False,
                scales=(1, 0.875, 0.75, 0.66),
                type='MultiScaleCrop'),
            dict(keep_ratio=False, scale=(224, 224), type='Resize'),
            dict(flip_ratio=0.5, type='Flip'),
            dict(input_format='NCHW', type='FormatShape'),
            dict(type='PackActionInputs'),
        ],
        type='VideoDataset'),
    num_workers=2,
    persistent_workers=True,
    sampler=dict(shuffle=True, type='DefaultSampler'))
train_pipeline = [
    dict(io_backend='disk', type='DecordInit'),
    dict(clip_len=1, frame_interval=1, num_clips=3, type='SampleFrames'),
    dict(type='DecordDecode'),
    dict(scale=(-1, 256), type='Resize'),
    dict(
        input_size=224,
        max_wh_scale_gap=1,
        random_crop=False,
        scales=(1, 0.875, 0.75, 0.66),
        type='MultiScaleCrop'),
    dict(keep_ratio=False, scale=(224, 224), type='Resize'),
    dict(flip_ratio=0.5, type='Flip'),
    dict(input_format='NCHW', type='FormatShape'),
    dict(type='PackActionInputs'),
]
val_cfg = dict(type='ValLoop')
val_dataloader = dict(
    batch_size=32,
    dataset=dict(
        ann_file='/mmaction2/classroomactionvideo/val.txt',
        data_prefix=dict(video='/mmaction2/classroomactionvideo/val/'),
        pipeline=[
            dict(io_backend='disk', type='DecordInit'),
            dict(
                clip_len=1,
                frame_interval=1,
                num_clips=3,
                test_mode=True,
                type='SampleFrames'),
            dict(type='DecordDecode'),
            dict(scale=(-1, 256), type='Resize'),
            dict(crop_size=224, type='CenterCrop'),
            dict(input_format='NCHW', type='FormatShape'),
            dict(type='PackActionInputs'),
        ],
        test_mode=True,
        type='VideoDataset'),
    num_workers=2,
    persistent_workers=True,
    sampler=dict(shuffle=False, type='DefaultSampler'))
val_evaluator = dict(type='AccMetric')
val_pipeline = [
    dict(io_backend='disk', type='DecordInit'),
    dict(
        clip_len=1,
        frame_interval=1,
        num_clips=3,
        test_mode=True,
        type='SampleFrames'),
    dict(type='DecordDecode'),
    dict(scale=(-1, 256), type='Resize'),
    dict(crop_size=224, type='CenterCrop'),
    dict(input_format='NCHW', type='FormatShape'),
    dict(type='PackActionInputs'),
]
vis_backends = [
    dict(type='LocalVisBackend'),
]
visualizer = dict(
    type='ActionVisualizer', vis_backends=[
        dict(type='LocalVisBackend'),
    ])
work_dir = './work_dirs/tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_classroom'

Reproduces the problem - command or script

python tools/train.py checkpiont/tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_classroom.py


Additional information

dataset path
--classroomactionvideo
    --train
    --val
    --train.txt
    --val.txt

Each line in train.txt and val.txt looks like this: drinking (1).mp4 1

I am a beginner working through the model fine-tuning section of the MMAction2 user guide. I want to train action recognition on my own dataset using VideoDataset, but training fails with ValueError: too many values to unpack (expected 2). What is wrong, and how can I fix it? Thanks.

cir7 commented 1 year ago

The error message indicates that there are too many values for each annotation line. Please check and make sure that your annotation file has the same format as follows:

some/path/000.mp4 1
some/path/001.mp4 1
...

You can refer to the documentation for details.
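A quick way to find the offending lines is to split each annotation line the same way and report those that do not yield exactly two fields. This is a minimal sketch, assuming the loader splits each stripped line on a single space (consistent with the `filename, label = line_split` line in the traceback); `find_bad_lines` is a hypothetical helper, not part of MMAction2.

```python
def find_bad_lines(lines, delimiter=" "):
    """Return (line_number, text) for lines that do not split into two fields.

    Assumption: the dataset loader splits each stripped line on `delimiter`
    and expects exactly `filename` and `label`.
    """
    bad = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # skip blank lines
        if len(text.split(delimiter)) != 2:
            bad.append((number, text))
    return bad


# The second line has a space inside the file name, so it splits into three parts.
sample = ["some/path/000.mp4 1", "drinking (1).mp4 1"]
print(find_bad_lines(sample))
```

To check a real file, pass `open(path).readlines()` as `lines`.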

luciferasura commented 1 year ago

@cir7 Should each line look like classroomvideodataset/train/001.mp4 1? After I changed it to this, I still got the original error.

cir7 commented 1 year ago

Could you paste part of your annotation file here?

luciferasura commented 1 year ago

Okay, please see where I went wrong.

train
drinking (31).mp4 1
drinking (32).mp4 1
drinking (33).mp4 1
drinking (34).mp4 1
.......
lecture (31).mp4 2
lecture (32).mp4 2
lecture (33).mp4 2
lecture (34).mp4 2
.....

train.txt
classroomvideo/train/drinking (31).mp4 1
classroomvideo/train/drinking (32).mp4 1
classroomvideo/train/drinking (33).mp4 1
classroomvideo/train/drinking (34).mp4 1
.......
classroomvideo/train/lecture (31).mp4 2
classroomvideo/train/lecture (32).mp4 2
classroomvideo/train/lecture (33).mp4 2
classroomvideo/train/lecture (34).mp4 2
.....

ChatGPT tells me the correct format is as follows. Is that right?

dataset/
├── train/
│   ├── class1/
│   │   ├── video1.mp4
│   │   ├── video2.mp4
│   │   └── ...
│   ├── class2/
│   │   ├── video3.mp4
│   │   ├── video4.mp4
│   │   └── ...
│   └── ...
├── val/
│   ├── class1/
│   │   ├── video5.mp4
│   │   ├── video6.mp4
│   │   └── ...
│   ├── class2/
│   │   ├── video7.mp4
│   │   ├── video8.mp4
│   │   └── ...
│   └── ...
├── train.txt
├── val.txt

train.txt

dataset/train/class1/video1.mp4 0
dataset/train/class1/video2.mp4 0
...
dataset/train/class2/video3.mp4 1
dataset/train/class2/video4.mp4 1
...

val.txt

dataset/val/class1/video5.mp4 0
dataset/val/class1/video6.mp4 0
...
dataset/val/class2/video7.mp4 1
dataset/val/class2/video8.mp4 1
...

@cir7 Thank you for your reply.

cir7 commented 1 year ago

Your video file names contain whitespace, and by default we split each annotation line on whitespace, so each line yields more values than expected. I suggest removing the whitespace from the video file names. Alternatively, use a comma (,) as the delimiter and modify the config file and your annotation file accordingly.
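For the comma-delimited route, the annotation file has to be rewritten so that the only comma on each line separates the file name from the label. The sketch below assumes the label is always the last whitespace-separated token and that no file name contains a comma; `to_comma_delimited` and `train_comma.txt` are hypothetical names. The `delimiter=','` entry in the config fragment is also an assumption about your installed `VideoDataset`, so check its signature in mmaction/datasets/video_dataset.py before relying on it.

```python
def to_comma_delimited(lines):
    """Rewrite 'file name with spaces.mp4 1' lines as 'file name with spaces.mp4,1'.

    Assumption: the label is the last whitespace-separated token on each line
    and file names contain no commas.
    """
    converted = []
    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # skip blank lines
        # Split once from the right so spaces inside the file name are kept.
        filename, label = text.rsplit(maxsplit=1)
        converted.append(f"{filename},{label}")
    return converted


print(to_comma_delimited(["classroomvideo/train/drinking (31).mp4 1"]))

# Hypothetical config fragment: only the keys relevant to the delimiter change.
dataset_cfg_sketch = dict(
    type='VideoDataset',
    ann_file='train_comma.txt',  # hypothetical path to the converted file
    delimiter=',',               # assumed VideoDataset argument; verify locally
)
```

Renaming the files to remove spaces avoids the config change entirely, at the cost of touching every video.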

luciferasura commented 1 year ago

I followed your advice and removed all the spaces, but now training always fails because it cannot find the MP4 files. For example: FileNotFoundError: [Errno 2] No such file or directory: '/mmaction2/mmaction/datasets/classvideo/train/play_phone(2).mp4'

The following is my train.txt:

train/drinking(1).mp4 1
train/drinking(2).mp4 1
train/drinking(3).mp4 1
....
train/lecture(1).mp4 2
train/lecture(2).mp4 2
train/lecture(3).mp4 2
....

The following is the new error:

Loads checkpoint by http backend from path: https://download.pytorch.org/models/resnet50-11ad3fa6.pth
08/30 17:03:18 - mmengine - INFO - These parameters in pretrained checkpoint are not loaded: {'fc.bias', 'fc.weight'}
08/30 17:03:19 - mmengine - WARNING - "FileClient" will be deprecated in future. Please use io functions in https://mmengine.readthedocs.io/en/latest/api/fileio.html#file-io
08/30 17:03:19 - mmengine - WARNING - "HardDiskBackend" is the alias of "LocalBackend" and the former will be deprecated in future.
08/30 17:03:19 - mmengine - INFO - Checkpoints will be saved to /mmaction2/work_dirs/tsn_imagenet-pretrained-r50_8xb32-1x1x3-100e_classroom.
Traceback (most recent call last):
  File "tools/train.py", line 135, in <module>
    main()
  File "tools/train.py", line 131, in main
    runner.train()
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1745, in train
    model = self.train_loop.run()  # type: ignore
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 111, in run_epoch
    for idx, data_batch in enumerate(self.dataloader):
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 408, in __getitem__
    data = self.prepare_data(idx)
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 790, in prepare_data
    return self.pipeline(data_info)
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 58, in __call__
    data = t(data)
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmcv/transforms/base.py", line 12, in __call__
    return self.transform(results)
  File "/mmaction2/mmaction/datasets/transforms/loading.py", line 1155, in transform
    container = self._get_video_reader(results['filename'])
  File "/mmaction2/mmaction/datasets/transforms/loading.py", line 1142, in _get_video_reader
    file_obj = io.BytesIO(self.file_client.get(filename))
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/fileio/file_client.py", line 301, in get
    return self.client.get(filepath)
  File "/usr/local/miniconda3/envs/mmlab/lib/python3.8/site-packages/mmengine/fileio/backends/local_backend.py", line 33, in get
    with open(filepath, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/mmaction2/mmaction/datasets/classvideo/train/play_phone(2).mp4'

cir7 commented 1 year ago

Please make sure this file exists: /mmaction2/mmaction/datasets/classvideo/train/play_phone(2).mp4. Maybe you forgot to rename the video files?
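One way to check this for every entry at once is to rebuild each full path and test whether it exists on disk. This is a minimal sketch, assuming the loader forms the path by joining the `video` entry of `data_prefix` with the file name from the annotation line; `missing_videos` is a hypothetical helper, and the prefix and annotation lines you pass in should be the ones from your own config and files.

```python
import os


def missing_videos(ann_lines, video_prefix, delimiter=" "):
    """Return the full paths from an annotation file that do not exist on disk.

    Assumption: the full path is os.path.join(video_prefix, filename), where
    filename is everything before the last `delimiter` on the line.
    """
    missing = []
    for raw in ann_lines:
        text = raw.strip()
        if not text:
            continue  # skip blank lines
        filename = text.rsplit(delimiter, 1)[0]
        full_path = os.path.join(video_prefix, filename)
        if not os.path.isfile(full_path):
            missing.append(full_path)
    return missing
```

If the annotation lines already start with train/ and the prefix also ends in train/, the joined path contains train/train/, which is a common reason for this kind of FileNotFoundError.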

luciferasura commented 1 year ago

I renamed every file by hand, then rechecked the file names and confirmed the spaces are gone. Besides my own config, is there anything else that needs to be modified? The official tutorial is not detailed on this point. Could you guide me?