open-mmlab / mmpretrain

OpenMMLab Pre-training Toolbox and Benchmark
https://mmpretrain.readthedocs.io/en/latest/
Apache License 2.0
3.44k stars 1.06k forks

[Bug] Problem while using CustomDataset #1836

Open TranTriDat opened 11 months ago

TranTriDat commented 11 months ago

Branch

main branch (mmpretrain version)

Describe the bug

_base_ = [
    '../_base_/models/resnet50.py',           # model settings
    '../_base_/schedules/imagenet_bs256.py',  # schedule settings
    '../_base_/default_runtime.py'
]

model = dict(
    type='ImageClassifier',  # The type of the main model (here is for image classification task).
    backbone=dict(
        type='ResNet',  # The type of the backbone module.
        depth=50,
        num_stages=4,
        out_indices=(3, ),
        frozen_stages=-1,
        style='pytorch'),
    neck=dict(type='GlobalAveragePooling'),  # The type of the neck module.
    head=dict(
        type='LinearClsHead',  # The type of the classification head module.
        num_classes=3,
        in_channels=2048,
        loss=dict(type='CrossEntropyLoss', loss_weight=1.0),
    ))

dataset_type = 'CustomDataset'
img_norm_cfg = dict(
    mean=[124.508, 116.050, 106.438],
    std=[58.577, 57.310, 57.437],
    to_rgb=False)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='RandomResizedCrop', scale=100, backend='pillow'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='ImageToTensor', keys=['img']),
    dict(type='ToTensor', keys=['gt_label']),
    dict(type='Collect', keys=['img', 'gt_label'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='Resize', scale=(100, -1), backend='pillow'),
    dict(type='CenterCrop', crop_size=100),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='ImageToTensor', keys=['img']),
    dict(type='Collect', keys=['img'])
]

train_dataloader = dict(
    batch_size=32,
    num_workers=2,
    dataset=dict(
        type='CustomDataset',
        data_prefix='data/YoriDataset_vgg/train/AXIAL',
        classes='data/classes.txt',
        ann_file='data/train_ann.txt',
        pipeline=train_pipeline
    ),
)

val_dataloader = dict(
    batch_size=32,
    num_workers=2,
    dataset=dict(
        type='CustomDataset',
        data_prefix='data/YoriDataset_vgg/validation/AXIAL',
        classes='data/classes.txt',
        ann_file='data/val_ann.txt',
        pipeline=test_pipeline
    ),
)

test_dataloader = dict(
    batch_size=32,
    num_workers=2,
    dataset=dict(
        type='CustomDataset',
        data_prefix='data/YoriDataset_vgg/test/AXIAL',
        classes='data/classes.txt',
        ann_file='data/test_ann.txt',
        pipeline=test_pipeline
    ),
)

val_evaluator = dict(type='Accuracy', topk=(1, 5))
test_evaluator = val_evaluator

optim_wrapper = dict(
    optimizer=dict(type='SGD', lr=0.1, momentum=0.9, weight_decay=0.0001))

param_scheduler = dict(
    type='MultiStepLR', by_epoch=True, milestones=[30, 60, 90], gamma=0.1)

train_cfg = dict(by_epoch=True, max_epochs=100, val_interval=1)
val_cfg = dict()
test_cfg = dict()

auto_scale_lr = dict(base_batch_size=256)

default_scope = 'mmpretrain'

default_hooks = dict(
    timer=dict(type='IterTimerHook'),
    logger=dict(type='LoggerHook', interval=100),
    param_scheduler=dict(type='ParamSchedulerHook'),
    checkpoint=dict(type='CheckpointHook', interval=1),
    sampler_seed=dict(type='DistSamplerSeedHook'),
    visualization=dict(type='VisualizationHook', enable=False),
)

env_cfg = dict(
    cudnn_benchmark=False,
    mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),
    dist_cfg=dict(backend='nccl'),
)

vis_backends = [dict(type='LocalVisBackend')]  # use local HDD backend
visualizer = dict(
    type='UniversalVisualizer', vis_backends=vis_backends, name='visualizer')

log_level = 'INFO'

load_from = None

resume = False

Got Error:

Traceback (most recent call last):
  File "./tools/train.py", line 162, in <module>
    main()
  File "./tools/train.py", line 158, in main
    runner.train()
  File "/home/user/miniconda3/envs/open-mmlab/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1777, in train
    model = self.train_loop.run()  # type: ignore
  File "/home/user/miniconda3/envs/open-mmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 96, in run
    self.run_epoch()
  File "/home/user/miniconda3/envs/open-mmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 112, in run_epoch
    self.run_iter(idx, data_batch)
  File "/home/user/miniconda3/envs/open-mmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 128, in run_iter
    outputs = self.runner.model.train_step(
  File "/home/user/miniconda3/envs/open-mmlab/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 113, in train_step
    data = self.data_preprocessor(data, True)
  File "/home/user/miniconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/mmpretrain/mmpretrain/models/utils/data_preprocessor.py", line 109, in forward
    inputs = self.cast_data(data['inputs'])
KeyError: 'inputs'

Environment

{'sys.platform': 'linux', 'Python': '3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0]', 'CUDA available': True, 'numpy_random_seed': 2147483648, 'GPU 0': 'NVIDIA GeForce RTX 3090', 'CUDA_HOME': '/usr/local/cuda-11.4', 'NVCC': 'Cuda compilation tools, release 11.4, V11.4.100', 'GCC': 'gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0', 'PyTorch': '1.10.1', 'TorchVision': '0.11.2', 'OpenCV': '4.8.1', 'MMEngine': '0.9.0', 'MMCV': '2.0.1', 'MMPreTrain': '1.1.0+a4c219e'}

Other information

My dataset folder has the same structure as shown in the attached image: [image]

And the ann_file format is like this:

'image path1' 'class number1'
'image path2' 'class number2'
...
'image path_n' 'class number_n'

For example: class A: 1, class B: 2
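So a concrete annotation file would look roughly like this (hypothetical file names; each line is a path relative to data_prefix followed by the class number above):

img_0001.png 1
img_0002.png 2
img_0003.png 1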

Lorenzo23 commented 11 months ago

I have the exact same problem when running with CustomDataset.

TianpengBu commented 11 months ago

You are missing PackInputs in your train_pipeline. In the new version of mmpretrain, the data pipeline should look like this:

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='RandomResizedCrop', scale=224),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(type='PackInputs'),
]
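The val/test pipeline needs the same change. Something along these lines should work (a sketch adapted from the default ImageNet configs; adjust scale and crop_size to your images). Note that Normalize is no longer needed in the pipeline, since normalization is handled by the model's data_preprocessor in mmpretrain 1.x:

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='ResizeEdge', scale=256, edge='short'),  # resize the short edge
    dict(type='CenterCrop', crop_size=224),
    dict(type='PackInputs'),  # pack image and labels into 'inputs'/'data_samples'
]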

TranTriDat commented 11 months ago

You are missing PackInputs in your train_pipeline. In the new version of mmpretrain, the data pipeline should look like this:

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='RandomResizedCrop', scale=224),
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),
    dict(type='PackInputs'),
]

Oh, OK, thank you. But if I want to use other models like VGG or ViT, is the config file the same as above, just replacing the _base_ and the pretrained link? Or do I need to add something else?

TianpengBu commented 11 months ago

One straightforward but messy way to do it: copy the corresponding data pipeline into your custom config file and make sure PackInputs is in the pipeline.
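For example, your train_dataloader could look roughly like this (a sketch keeping your paths, classes, and ann_file settings from above; the sampler line follows the default configs):

train_dataloader = dict(
    batch_size=32,
    num_workers=2,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type='CustomDataset',
        data_prefix='data/YoriDataset_vgg/train/AXIAL',
        classes='data/classes.txt',
        ann_file='data/train_ann.txt',
        # train_pipeline as above, ending with dict(type='PackInputs')
        pipeline=train_pipeline),
)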

TranTriDat commented 11 months ago

One straightforward but messy way to do it: copy the corresponding data pipeline into your custom config file and make sure PackInputs is in the pipeline.

Oh, thank you for the suggestion. Currently I'm trying to train a VGG model for classification, but the accuracy is still stuck at 34% after 100 epochs. With that config, I replaced the resnet50.py base and the ResNet pretrained weights with the VGG files, and also included PackInputs in the config file. Do you have any advice for that problem?