w1oves / Rein

[CVPR 2024] Official implement of <Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation>
https://zxwei.site/rein
GNU General Public License v3.0

some difficulties training my own dataset #34

Closed Zhangyao2414 closed 4 months ago

Zhangyao2414 commented 4 months ago

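Hello,

Thank you very much for open-sourcing your code!

After downloading your code, I did the following:
1. Downloaded the pretrained model "dinov2_vitl14_pretrain.pth" and generated "dinov2_converted.pth" with "python tools/convert_models/convert_dinov2.py checkpoints/dinov2_vitl14_pretrain.pth checkpoints/dinov2_converted.pth".
2. Created a "data" folder under the Rein-train folder. It contains two folders, images and labels, and each of those contains two folders, train and val. The train and val folders hold my own '.png' files for the training set and the validation set respectively.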


3. Copied "cityscapes_512x512.py" from the "configs/_base_/datasets" folder, renamed it to "Cfg2.py", and changed the paths in it to point to my own dataset.
4. Copied "rein_dinov2_mask2former_512x512_bs1x4.py" from the "configs/dinov2" folder, renamed it to "Cfg1.py", and updated the corresponding "_base_" paths.

Finally, I ran "python tools/train.py configs/dinov2/Cfg1.py" and got this error:

"ValueError: val_dataloader, val_cfg, and val_evaluator should be either all None or not None, but got val_dataloader=None, val_cfg={'type': 'ValLoop'}, val_evaluator=None"

After I commented out val_cfg and test_cfg in "Cfg1.py", a different error appeared:

"KeyError: 'cfg or default_args must contain the key "type", but got {\'pipeline\': [{\'type\': \'LoadImageFromFile\'}, {\'type\': \'LoadAnnotations\'}, {\'type\': \'RandomChoiceResize\', \'scales\': [256, 307, 358, 409, 460], \'resize_type\': \'ResizeShortestEdge\', \'max_size\': 2048}, {\'type\': \'RandomCrop\', \'crop_size\': (512, 512), \'cat_max_ratio\': 0.75}, {\'type\': \'RandomFlip\', \'prob\': 0.5}, {\'type\': \'PhotoMetricDistortion\'}, {\'type\': \'PackSegInputs\'}]}\nNone'"

Is this a problem with my dataset setup or with my code changes? Could you give me some advice for this situation? I look forward to your reply!
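The first error is a consistency rule that the message itself spells out: `val_dataloader`, `val_cfg` and `val_evaluator` have to be either all `None` or all set. A minimal sketch of that rule, using plain dicts and a hypothetical helper `check_val_settings` (an illustration only, not the actual mmengine code):

```python
def check_val_settings(cfg: dict) -> None:
    """Raise ValueError unless the three validation keys are all set or all None."""
    keys = ("val_dataloader", "val_cfg", "val_evaluator")
    values = [cfg.get(key) for key in keys]
    all_none = all(v is None for v in values)
    all_set = all(v is not None for v in values)
    if not (all_none or all_set):
        state = ", ".join(f"{k}={v!r}" for k, v in zip(keys, values))
        raise ValueError(
            "val_dataloader, val_cfg, and val_evaluator should be either "
            f"all None or not None, but got {state}"
        )


# The situation reported above: only val_cfg is defined.
reported = {"val_cfg": {"type": "ValLoop"}}
try:
    check_val_settings(reported)
except ValueError as err:
    print(err)
```

Commenting out `val_cfg` satisfies the rule but leaves training without validation; the alternative is to define the missing dataloader and evaluator.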

w1oves commented 4 months ago

Please upload the corresponding config files.

Zhangyao2414 commented 4 months ago

———————————————————————————— Cfg2 ————————————————————
dataset_type = "SEMDataset"
dataset_root = "/Users/zhangyao2414/Downloads/Rein-train/data"
dataset_crop_size = (512, 512)

# Image processing steps (train)
train_pipeline = [
    dict(type="LoadImageFromFile"),
    dict(type="LoadAnnotations"),
    dict(type="Resize", scale=(1024, 512)),
    dict(type="RandomCrop", crop_size=dataset_crop_size, cat_max_ratio=0.75),
    dict(type="RandomFlip", prob=0.5),
    dict(type="PhotoMetricDistortion"),
    dict(type="PackSegInputs"),
]

# Image processing steps (test)
test_pipeline = [
    dict(type="LoadImageFromFile"),
    dict(type="Resize", scale=(1024, 512), keep_ratio=True),
    # add loading annotation after Resize because ground truth
    # does not need to do resize data transform
    dict(type="LoadAnnotations"),
    dict(type="PackSegInputs"),
]

train_cityscapes = dict(
    type=dataset_type,
    data_root=dataset_root,
    data_prefix=dict(
        img_path="images/train",
        seg_map_path="labels/train",
    ),
    pipeline=train_pipeline,
)

val_cityscapes = dict(
    type=dataset_type,
    data_root=dataset_root,
    data_prefix=dict(
        img_path="images/val",
        seg_map_path="labels/val",
    ),
    pipeline=test_pipeline,
)

——————————————————————————— Cfg1 ———————————————————————

# dataset config
# Training config for a semantic segmentation model based on "Mask2former" and "DINOv2".
# "_base_" lists the base config files: dataset, runtime settings, and model structure.
_base_ = [
    "../_base_/datasets/Cfg2.py",
    "../_base_/default_runtime.py",
    "../_base_/models/dinov2_mask2former.py",
]

# Preprocessing steps for the training data
train_pipeline = [
    dict(type="LoadImageFromFile"),  # load the image from file
    dict(type="LoadAnnotations"),  # load the annotations for the image
    dict(
        type="RandomChoiceResize",  # pick a resize scale at random
        scales=[int(512 * x * 0.1) for x in range(5, 10)],
        resize_type="ResizeShortestEdge",
        max_size=2048,
    ),
    dict(type="RandomCrop", crop_size={{_base_.crop_size}}, cat_max_ratio=0.75),  # random crop
    dict(type="RandomFlip", prob=0.5),  # random flip
    dict(type="PhotoMetricDistortion"),  # photometric distortion
    dict(type="PackSegInputs"),  # pack into the model's input format
]
train_dataloader = dict(batch_size=4, dataset=dict(pipeline=train_pipeline))

# AdamW optimizer, no weight decay for position embedding & layer norm
# in backbone
embed_multi = dict(lr_mult=1.0, decay_mult=0.0)

# Optimizer settings
optim_wrapper = dict(
    constructor="PEFTOptimWrapperConstructor",
    optimizer=dict(
        type="AdamW", lr=0.0001, weight_decay=0.05, eps=1e-8, betas=(0.9, 0.999)
    ),
    paramwise_cfg=dict(
        custom_keys={
            "norm": dict(decay_mult=0.0),
            "query_embed": embed_multi,
            "level_embed": embed_multi,
            "learnable_tokens": embed_multi,
            "reins.scale": embed_multi,
        },
        norm_decay_mult=0.0,
    ),
)

# Learning rate scheduler
param_scheduler = [
    dict(type="PolyLR", eta_min=0, power=0.9, begin=0, end=40000, by_epoch=False)
]

# training schedule for 40k iterations
train_cfg = dict(type="IterBasedTrainLoop", max_iters=40000, val_interval=10000)
val_cfg = dict(type="ValLoop")
test_cfg = dict(type="TestLoop")

default_hooks = dict(
    timer=dict(type="IterTimerHook"),
    logger=dict(type="LoggerHook", interval=50, log_metric_by_epoch=False),
    param_scheduler=dict(type="ParamSchedulerHook"),
    checkpoint=dict(
        type="CheckpointHook", by_epoch=False, interval=4000, max_keep_ckpts=3
    ),
    sampler_seed=dict(type="DistSamplerSeedHook"),
    visualization=dict(type="SegVisualizationHook"),
)
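The `KeyError` fits the way Cfg1 overrides `train_dataloader`: it supplies only `batch_size` and a `dataset` dict holding `pipeline`, and relies on the base dataset config to contribute everything else, including the dataset `type`. The first Cfg2 above defines no `train_dataloader`, so there is nothing to merge with. A minimal sketch of that behaviour, using a hypothetical `merge` helper as a simplified stand-in for config inheritance (not mmengine's actual implementation), with the pipeline shortened to two steps:

```python
from copy import deepcopy


def merge(base: dict, child: dict) -> dict:
    """Recursively overlay child onto base; nested dicts are merged key by key."""
    merged = deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# The scales expression from Cfg1; it yields the values printed in the KeyError.
scales = [int(512 * x * 0.1) for x in range(5, 10)]
print(scales)  # [256, 307, 358, 409, 460]

train_pipeline = [
    dict(type="LoadImageFromFile"),
    dict(type="RandomChoiceResize", scales=scales),
]
child = dict(train_dataloader=dict(batch_size=4, dataset=dict(pipeline=train_pipeline)))

# Base without a train_dataloader, like the first Cfg2 in this thread.
base_without = dict(dataset_type="SEMDataset")

# Base with a train_dataloader whose dataset names its type (values are illustrative).
base_with = dict(
    train_dataloader=dict(
        batch_size=2,
        dataset=dict(type="SEMDataset", data_root="data"),
    )
)

# Only "pipeline" survives in the first case, so there is no "type" to build from.
print(sorted(merge(base_without, child)["train_dataloader"]["dataset"]))
print(sorted(merge(base_with, child)["train_dataloader"]["dataset"]))
```

In the second case the merged dataset dict carries `type`, `data_root` and the overriding `pipeline`, which is the shape the child config expects its base to provide.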

w1oves commented 4 months ago

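In Cfg2 you need to define a suitable dataloader for each of train, val and test; I don't see any dataloader in your config. Please refer to the existing dataset configs and the mmseg documentation.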

Zhangyao2414 commented 4 months ago

Hello! Sorry to bother you again. Following the dataset configs in your other files, today I added the train, val and test dataloaders below to Cfg2, and the code now starts without errors.

———————————————————— Cfg2 ——————————————————————————————————
dataset_type = "SEMDataset"
dataset_root = "/Users/zhangyao2414/Downloads/Rein-train/data/SEM_data"
dataset_crop_size = (512, 512)

train_pipeline = [
    dict(type="LoadImageFromFile"),
    dict(type="LoadAnnotations"),
    dict(type="Resize", scale=(512, 512)),
    dict(type="RandomCrop", crop_size=dataset_crop_size, cat_max_ratio=0.75),
    dict(type="RandomFlip", prob=0.5),
    dict(type="PhotoMetricDistortion"),
    dict(type="PackSegInputs"),
]

test_pipeline = [
    dict(type="LoadImageFromFile"),
    dict(type="Resize", scale=(512, 512), keep_ratio=True),
    dict(type="LoadAnnotations"),
    dict(type="PackSegInputs"),
]

train_cityscapes = dict(
    type=dataset_type,
    data_root=dataset_root,
    data_prefix=dict(
        img_path="images/train",
        seg_map_path="labels/train",
    ),
    img_suffix=".png",
    seg_map_suffix=".png",
    pipeline=train_pipeline,
)

val_cityscapes = dict(
    type=dataset_type,
    data_root=dataset_root,
    data_prefix=dict(
        img_path="images/val",
        seg_map_path="labels/val",
    ),
    img_suffix=".png",
    seg_map_suffix=".png",
    pipeline=test_pipeline,
)

test_cityscapes = val_cityscapes

train_dataloader = dict(
    batch_size=4,
    num_workers=1,
    persistent_workers=True,
    pin_memory=True,
    sampler=dict(type="InfiniteSampler", shuffle=True),
    dataset=train_cityscapes,
)

val_dataloader = dict(
    batch_size=1,
    num_workers=1,
    persistent_workers=True,
    sampler=dict(type="DefaultSampler", shuffle=False),
    dataset=val_cityscapes,
)

test_dataloader = val_dataloader

val_evaluator = dict(
    type="DGIoUMetric",
    iou_metrics=["mIoU"],
    dataset_keys=["citys"],
    format_only=True,
    output_dir="work_dirs/eval",
)
test_evaluator = val_evaluator
————————————————————————————————————————————————

However, once it is running, the console stays stuck at the same place, and I don't know whether that means the run is working.

[image: screenshot of the console output]

If not, what might be causing this? And what should a successful run look like?
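One thing that can be checked independently of the training loop is whether the files on disk match what the dataset config expects. With `img_suffix=".png"` and `seg_map_suffix=".png"`, and assuming the usual convention that an image and its label share the same file name, each file under `images/<split>` needs a counterpart under `labels/<split>`. A minimal sketch using only the standard library; `find_unpaired` is a hypothetical helper, the layout is the one described in this thread, and the demo file names are made up:

```python
import tempfile
from pathlib import Path


def find_unpaired(root, split, img_suffix=".png", seg_suffix=".png"):
    """Return (images without a label, labels without an image) for one split."""
    img_dir = Path(root) / "images" / split
    seg_dir = Path(root) / "labels" / split
    img_stems = {p.name[: -len(img_suffix)] for p in img_dir.glob(f"*{img_suffix}")}
    seg_stems = {p.name[: -len(seg_suffix)] for p in seg_dir.glob(f"*{seg_suffix}")}
    return sorted(img_stems - seg_stems), sorted(seg_stems - img_stems)


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp) / "images" / "train"
    labels = Path(tmp) / "labels" / "train"
    images.mkdir(parents=True)
    labels.mkdir(parents=True)
    for name in ("a.png", "b.png"):
        (images / name).touch()
    (labels / "a.png").touch()
    print(find_unpaired(tmp, "train"))  # (['b'], [])
```

Pointing `find_unpaired` at the real `dataset_root` for both `train` and `val` should return two empty lists when the folders are consistent. This only rules out one possible cause and says nothing about why the console stalls.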

w1oves commented 4 months ago

That is not expected; when things are working, training logs should appear. You can add me on WeChat: wzx-vi

prithwijit-shl commented 1 month ago

Hi. I am getting the exact same warning, "The prefix is not set in metric class DGIoUMetric.", while training on my own dataset. Did you find a fix?