open-mmlab / mmdetection

OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io
Apache License 2.0

How to build a dataloader with multiple datasets? Is the 'ConcatDataset' method available? #11115

Open lanhas opened 10 months ago

lanhas commented 10 months ago

I have several datasets that, for some reason, I can't merge into a single one, so I tried to combine them in the config file, but training fails. The error message is "RuntimeError: each element in list of batch should be of equal size". I couldn't find any information on how to solve this. Here is my config file:

dataset_a = dict(
    type='CocoDataset',
    data_root=data_root_a,
    metainfo=metainfo,
    ann_file='annotations/coco_annotations_a.json',
    data_prefix=dict(img='images/'))

dataset_b = dict(
    type='CocoDataset',
    data_root=data_root_b,
    metainfo=metainfo,
    ann_file='annotations/coco_annotations_b.json',
    data_prefix=dict(img='images/'))

dataset_c = dict(
    type='CocoDataset',
    data_root=data_root_c,
    metainfo=metainfo,
    ann_file='annotations/coco_annotations_c.json',
    data_prefix=dict(img='images/'))

train_dataloader = dict(
    _delete_=True,
    batch_size=18,
    num_workers=4,
    persistent_workers=True,
    # sampler=dict(type='MultiSourceSampler', batch_size=18, source_ratio=[10, 4, 4]),
    # batch_sampler=None,
    sampler=dict(type='InfiniteSampler'),
    batch_sampler=dict(type='AspectRatioBatchSampler'),
    dataset=dict(
        type='ConcatDataset',
        datasets=[dataset_a, dataset_b, dataset_c]))

I am using MMDetection 3.2.0.

Nina0109 commented 7 months ago

Hi, I have the same problem. Have you solved it? If so, could you share an example config file that uses ConcatDataset?
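
For readers following the thread, it may help to see what a concat-style wrapper does conceptually: it exposes several datasets as one and maps a global index back to a (source dataset, local index) pair using cumulative lengths. The sketch below is a toy illustration in plain Python, not MMDetection's or MMEngine's actual implementation; the class name `ToyConcatDataset` and the method `locate` are hypothetical, and the three lists merely stand in for `dataset_a`, `dataset_b` and `dataset_c`. It does not by itself explain or fix the collate error reported above.

```python
from bisect import bisect_right
from itertools import accumulate


class ToyConcatDataset:
    """Hypothetical stand-in for a concat-style dataset wrapper."""

    def __init__(self, datasets):
        self.datasets = list(datasets)
        # Running totals of the sub-dataset lengths, e.g. [3, 5, 9].
        self.cumulative_sizes = list(
            accumulate(len(d) for d in self.datasets))

    def __len__(self):
        return self.cumulative_sizes[-1] if self.cumulative_sizes else 0

    def locate(self, idx):
        """Map a global index to (source dataset index, local index)."""
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # First sub-dataset whose cumulative size is greater than idx.
        source = bisect_right(self.cumulative_sizes, idx)
        start = self.cumulative_sizes[source - 1] if source else 0
        return source, idx - start

    def __getitem__(self, idx):
        source, offset = self.locate(idx)
        return self.datasets[source][offset]


# Three toy "datasets" standing in for dataset_a, dataset_b, dataset_c.
concat = ToyConcatDataset([
    ["a0", "a1", "a2"],
    ["b0", "b1"],
    ["c0", "c1", "c2", "c3"],
])
```

The source index returned by `locate` is the kind of information a multi-source sampler would need in order to draw from each dataset at a chosen ratio, such as the `source_ratio=[10, 4, 4]` in the config above.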