When I add `metainfo = dict(CLASSES=('a', 'b'))` in my yolox_s_8xb8-300e_coco.py, nothing changes. When I run the training code in MMDetection, the loss is normal, so I suspect there may be a mistake in the MMYOLO code.
@yangxiaoyany Hi, I see that you did not add `metainfo` to the above configuration. You need to provide the correct configuration.
num_classes = 4
metainfo = dict(  # set metainfo according to the category info in class_with_id.txt
    # PALETTE=[(220, 20, 60)]  # colors used when plotting; any values work
    CLASSES=('holothurian', 'echinus', 'scallop', 'starfish'),
    PALETTE=[(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230)]
)
Hello, this is my metainfo.
Hello, after debugging for a long time, I found that it is useless to add `metainfo` to the config file, because the code always uses the 80 COCO class names as its categories, and there is no information about `metainfo` in the YOLOX config file. My config file is the same as the code above. Thank you very much for your reply.
python tools/analysis_tools/browse_dataset.py configs/custom/my_right_yolox_s.py --phase train
/home/yxy/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmengine/model/utils.py:138: UserWarning: Cannot import torch.fx, merge_dict is a simple function to merge multiple dicts
  warnings.warn('Cannot import torch.fx, merge_dict is a simple function '
/home/yxy/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmdet/evaluation/metrics/lvis_metric.py:23: UserWarning: mmlvis is deprecated, please install official lvis-api by "pip install git+https://github.com/lvis-dataset/lvis-api.git"
  UserWarning)
loading annotations into memory...
Done (t=0.37s)
creating index...
index created!
/home/yxy/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmengine/visualization/visualizer.py:170: UserWarning: Visualizer backend is not initialized because save_dir is None.
  warnings.warn('Visualizer backend is not initialized '
{'classes': ('person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush'), 'palette': [(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230), (106, 0, 228), (0, 60, 100), (0, 80, 100), (0, 0, 70), (0, 0, 192), (250, 170, 30), (100, 170, 30), (220, 220, 0), (175, 116, 175), (250, 0, 30), (165, 42, 42), (255, 77, 255), (0, 226, 252), (182, 182, 255), (0, 82, 0), (120, 166, 157), (110, 76, 0), (174, 57, 255), (199, 100, 0), (72, 0, 118), (255, 179, 240), (0, 125, 92), (209, 0, 151), (188, 208, 182), (0, 220, 176), (255, 99, 164), (92, 0, 73), (133, 129, 255), (78, 180, 255), (0, 228, 0), (174, 255, 243), (45, 89, 255), (134, 134, 103), (145, 148, 174), (255, 208, 186), (197, 226, 255), (171, 134, 1), (109, 63, 54), (207, 138, 255), (151, 0, 95), (9, 80, 61), (84, 105, 51), (74, 65, 105), (166, 196, 102), (208, 195, 210), (255, 109, 65), (0, 143, 149), (179, 0, 194), (209, 99, 106), (5, 121, 0), (227, 255, 205), (147, 186, 208), (153, 69, 1), (3, 95, 161), (163, 255, 0), (119, 0, 170), (0, 182, 199), (0, 165, 120), (183, 130, 88), (95, 32, 0), (130, 114, 135), (110, 129, 133), (166, 74, 118), (219, 142, 185), (79, 210, 114), (178, 90, 62), (65, 70, 15), (127, 167, 115), (59, 105, 106), (142, 108, 45), (196, 172, 0), (95, 54, 80), (128, 76, 255), (201, 57, 1), (246, 0, 122), (191, 162, 208)], 'CLASSES': ('holothurian', 'echinus', 'scallop', 'starfish'), 'PALETTE': [(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230)]}
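Note what this printout reveals: the effective entries in `dataset.metainfo` are the lowercase `classes`/`palette` (still the 80 COCO names), while the uppercase `CLASSES`/`PALETTE` from the config were merged in as extra, unused keys, since mmengine's `BaseDataset` merges user-supplied metainfo into its class-level defaults by exact key. A standalone sketch of how to check which names a dataset will actually use (the config path is the one from the command above):

```python
from mmengine.config import Config

from mmyolo.registry import DATASETS
from mmyolo.utils import register_all_modules

register_all_modules()  # register mmyolo modules so DATASETS can resolve types
cfg = Config.fromfile('configs/custom/my_right_yolox_s.py')
dataset = DATASETS.build(cfg.train_dataloader.dataset)
# Annotations are matched against the lowercase 'classes' entry; if this
# still prints the COCO names, the custom metainfo never reached the dataset.
print(dataset.metainfo['classes'])
```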
My browse_dataset.py:
import argparse
import os.path as osp
import sys
from typing import Tuple

import cv2
import mmcv
import numpy as np
from mmdet.models.utils import mask2ndarray
from mmdet.structures.bbox import BaseBoxes
from mmengine.config import Config, DictAction
from mmengine.dataset import Compose
from mmengine.utils import ProgressBar
from mmengine.visualization import Visualizer

from mmyolo.registry import DATASETS, VISUALIZERS
from mmyolo.utils import register_all_modules


def parse_args():
    parser = argparse.ArgumentParser(description='Browse a dataset')
    parser.add_argument('config', help='train config file path')
    parser.add_argument(
        '--phase', '-p', default='train', type=str,
        choices=['train', 'test', 'val'],
        help='phase of dataset to visualize, accept "train" "test" and "val".'
        ' Defaults to "train".')
    parser.add_argument(
        '--mode', '-m', default='transformed', type=str,
        choices=['original', 'transformed', 'pipeline'],
        help='display mode; display original pictures or '
        'transformed pictures or comparison pictures. "original" '
        'means show images load from disk; "transformed" means '
        'to show images after transformed; "pipeline" means show all '
        'the intermediate images. Defaults to "transformed".')
    parser.add_argument(
        '--output-dir', default=None, type=str,
        help='If there is no display interface, you can save it.')
    parser.add_argument('--not-show', default=False, action='store_true')
    parser.add_argument(
        '--show-number', '-n', type=int, default=sys.maxsize,
        help='number of images selected to visualize, '
        'must bigger than 0. if the number is bigger than length '
        'of dataset, show all the images in dataset; '
        'default "sys.maxsize", show all images in dataset')
    parser.add_argument(
        '--show-interval', '-i', type=float, default=3,
        help='the interval of show (s)')
    parser.add_argument(
        '--cfg-options', nargs='+', action=DictAction,
        help='override some settings in the used config, the key-value pair '
        'in xxx=yyy format will be merged into config file. If the value to '
        'be overwritten is a list, it should be like key="[a,b]" or key=a,b '
        'It also allows nested list/tuple values, e.g. key="[(a,b),(c,d)]" '
        'Note that the quotation marks are necessary and that no white space '
        'is allowed.')
    args = parser.parse_args()
    return args
def _get_adaptive_scale(img_shape: Tuple[int, int],
                        min_scale: float = 0.3,
                        max_scale: float = 3.0) -> float:
    """Get adaptive scale according to image shape.

    The target scale depends on the short edge length of the image. If the
    short edge length equals 224, the output is 1.0, and the output scales
    linearly with the short edge length. You can also specify the minimum
    scale and the maximum scale to limit the linear scale.

    Args:
        img_shape (Tuple[int, int]): The shape of the canvas image.
        min_scale (float): The minimum scale. Defaults to 0.3.
        max_scale (float): The maximum scale. Defaults to 3.0.

    Returns:
        float: The adaptive scale.
    """
    short_edge_length = min(img_shape)
    scale = short_edge_length / 224.
    return min(max(scale, min_scale), max_scale)
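As a quick aside on this helper: the scale is `short_edge / 224`, clamped to `[min_scale, max_scale]`. For example:

```python
_get_adaptive_scale((448, 640))   # 448 / 224 = 2.0
_get_adaptive_scale((100, 3000))  # 100 / 224 ≈ 0.45, inside [0.3, 3.0]
_get_adaptive_scale((32, 32))     # 32 / 224 ≈ 0.14, clamped up to 0.3
```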
def make_grid(imgs, names):
    """Concat list of pictures into a single big picture, align height here."""
    visualizer = Visualizer.get_current_instance()
    ori_shapes = [img.shape[:2] for img in imgs]
    max_height = int(max(img.shape[0] for img in imgs) * 1.1)
    min_width = min(img.shape[1] for img in imgs)
    horizontal_gap = min_width // 10
    img_scale = _get_adaptive_scale((max_height, min_width))
    texts = []
    text_positions = []
    start_x = 0
    for i, img in enumerate(imgs):
        pad_height = (max_height - img.shape[0]) // 2
        pad_width = horizontal_gap // 2
        # make border
        imgs[i] = cv2.copyMakeBorder(
            img,
            pad_height,
            max_height - img.shape[0] - pad_height + int(img_scale * 30 * 2),
            pad_width,
            pad_width,
            cv2.BORDER_CONSTANT,
            value=(255, 255, 255))
        texts.append(f'{"execution: "}{i}\n{names[i]}\n{ori_shapes[i]}')
        text_positions.append(
            [start_x + img.shape[1] // 2 + pad_width, max_height])
        start_x += img.shape[1] + horizontal_gap
    display_img = np.concatenate(imgs, axis=1)
    visualizer.set_image(display_img)
    img_scale = _get_adaptive_scale(display_img.shape[:2])
    visualizer.draw_texts(
        texts,
        positions=np.array(text_positions),
        font_sizes=img_scale * 7,
        colors='black',
        horizontal_alignments='center',
        font_families='monospace')
    return visualizer.get_image()
class InspectCompose(Compose):
    """Compose multiple transforms sequentially.

    And record "img" field of all results in one list.
    """

    def __init__(self, transforms, intermediate_imgs):
        super().__init__(transforms=transforms)
        self.intermediate_imgs = intermediate_imgs

    def __call__(self, data):
        if 'img' in data:
            self.intermediate_imgs.append({
                'name': 'original',
                'img': data['img'].copy()
            })
        self.ptransforms = [
            self.transforms[i] for i in range(len(self.transforms) - 1)
        ]
        for t in self.ptransforms:
            data = t(data)
        # Keep the same meta_keys in the PackDetInputs
        self.transforms[-1].meta_keys = [key for key in data]
        data_sample = self.transforms[-1](data)
        if data is None:
            return None
        if 'img' in data:
            self.intermediate_imgs.append({
                'name': t.__class__.__name__,
                'dataset_sample': data_sample['data_samples']
            })
        return data
def main():
    args = parse_args()
    cfg = Config.fromfile(args.config)
    if args.cfg_options is not None:
        cfg.merge_from_dict(args.cfg_options)

    # register all modules in mmyolo into the registries
    register_all_modules()

    dataset_cfg = cfg.get(args.phase + '_dataloader').get('dataset')
    dataset = DATASETS.build(dataset_cfg)
    visualizer = VISUALIZERS.build(cfg.visualizer)
    visualizer.dataset_meta = dataset.metainfo
    print(visualizer.dataset_meta)

    intermediate_imgs = []

    # TODO: The dataset wrapper occasion is not considered here
    dataset.pipeline = InspectCompose(dataset.pipeline.transforms,
                                      intermediate_imgs)

    # init visualization image number
    assert args.show_number > 0
    display_number = min(args.show_number, len(dataset))

    progress_bar = ProgressBar(display_number)
    for i, item in zip(range(display_number), dataset):
        image_i = []
        result_i = [result['dataset_sample'] for result in intermediate_imgs]
        for k, datasample in enumerate(result_i):
            image = datasample.img
            gt_instances = datasample.gt_instances
            image = image[..., [2, 1, 0]]  # bgr to rgb
            gt_bboxes = gt_instances.get('bboxes', None)
            if gt_bboxes is not None and isinstance(gt_bboxes, BaseBoxes):
                gt_instances.bboxes = gt_bboxes.tensor
            gt_masks = gt_instances.get('masks', None)
            if gt_masks is not None:
                masks = mask2ndarray(gt_masks)
                gt_instances.masks = masks.astype(bool)
            datasample.gt_instances = gt_instances
            # get filename from dataset or just use index as filename
            visualizer.add_datasample(
                'result',
                image,
                datasample,
                draw_pred=False,
                draw_gt=True,
                show=False)
            image_show = visualizer.get_image()
            image_i.append(image_show)

        if args.mode == 'original':
            image = image_i[0]
        elif args.mode == 'transformed':
            image = image_i[-1]
        else:
            image = make_grid([result for result in image_i],
                              [result['name'] for result in intermediate_imgs])

        if hasattr(datasample, 'img_path'):
            filename = osp.basename(datasample.img_path)
        else:
            # some datasets have no image path
            filename = f'{i}.jpg'
        out_file = osp.join(args.output_dir,
                            filename) if args.output_dir is not None else None

        if out_file is not None:
            mmcv.imwrite(image[..., ::-1], out_file)

        if not args.not_show:
            visualizer.show(
                image, win_name=filename, wait_time=args.show_interval)

        intermediate_imgs.clear()
        progress_bar.update()


if __name__ == '__main__':
    main()
Hi @yangxiaoyany Please paste the latest generated config under './work_dirs/yolox_s_8xb8-300e_coco' so we can take a look.
Hello, thanks for your reply. This is the newest config file in './work_dirs/yolox_s_8xb8-300e_coco':
default_scope = 'mmyolo'
default_hooks = dict(
timer=dict(type='IterTimerHook'),
logger=dict(type='LoggerHook', interval=10),
param_scheduler=dict(type='ParamSchedulerHook'),
checkpoint=dict(
type='CheckpointHook', interval=2, max_keep_ckpts=5, save_best='auto'),
sampler_seed=dict(type='DistSamplerSeedHook'),
visualization=dict(type='mmdet.DetVisualizationHook'))
env_cfg = dict(
cudnn_benchmark=False,
mp_cfg=dict(mp_start_method='fork', opencv_num_threads=0),
dist_cfg=dict(backend='nccl'))
vis_backends = [dict(type='LocalVisBackend')]
visualizer = dict(
type='mmdet.DetLocalVisualizer',
vis_backends=[dict(type='LocalVisBackend')],
name='visualizer')
log_processor = dict(type='LogProcessor', window_size=50, by_epoch=True)
log_level = 'INFO'
load_from = '/home/yxy/下载/yolox_s_8xb8-300e_coco_20220917_030738-d7e60cb2.pth'
resume = False
file_client_args = dict(backend='disk')
data_root = '/home/yxy/mmdetection/data/coco/'
dataset_type = 'YOLOv5CocoDataset'
img_scale = (640, 640)
deepen_factor = 0.33
widen_factor = 0.5
save_epoch_intervals = 2
train_batch_size_per_gpu = 16
train_num_workers = 4
val_batch_size_per_gpu = 1
val_num_workers = 2
max_epochs = 300
num_last_epochs = 15
model = dict(
type='YOLODetector',
init_cfg=dict(
type='Kaiming',
layer='Conv2d',
a=2.23606797749979,
distribution='uniform',
mode='fan_in',
nonlinearity='leaky_relu'),
use_syncbn=False,
data_preprocessor=dict(
type='mmdet.DetDataPreprocessor',
pad_size_divisor=32,
batch_augments=[
dict(
type='mmdet.BatchSyncRandomResize',
random_size_range=(480, 800),
size_divisor=32,
interval=10)
]),
backbone=dict(
type='YOLOXCSPDarknet',
deepen_factor=0.33,
widen_factor=0.5,
out_indices=(2, 3, 4),
spp_kernal_sizes=(5, 9, 13),
norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),
act_cfg=dict(type='SiLU', inplace=True)),
neck=dict(
type='YOLOXPAFPN',
deepen_factor=0.33,
widen_factor=0.5,
in_channels=[256, 512, 1024],
out_channels=256,
norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),
act_cfg=dict(type='SiLU', inplace=True)),
bbox_head=dict(
type='YOLOXHead',
head_module=dict(
type='YOLOXHeadModule',
num_classes=4,
in_channels=256,
feat_channels=256,
widen_factor=0.5,
stacked_convs=2,
featmap_strides=(8, 16, 32),
use_depthwise=False,
norm_cfg=dict(type='BN', momentum=0.03, eps=0.001),
act_cfg=dict(type='SiLU', inplace=True)),
loss_cls=dict(
type='mmdet.CrossEntropyLoss',
use_sigmoid=True,
reduction='sum',
loss_weight=0.07500000000000001),
loss_bbox=dict(
type='mmdet.IoULoss',
mode='square',
eps=1e-16,
reduction='sum',
loss_weight=5.0),
loss_obj=dict(
type='mmdet.CrossEntropyLoss',
use_sigmoid=True,
reduction='sum',
loss_weight=1.0),
loss_bbox_aux=dict(
type='mmdet.L1Loss', reduction='sum', loss_weight=1.0)),
train_cfg=dict(
assigner=dict(
type='mmdet.SimOTAAssigner',
center_radius=2.5,
iou_calculator=dict(type='mmdet.BboxOverlaps2D'))),
test_cfg=dict(
yolox_style=True,
multi_label=True,
score_thr=0.001,
max_per_img=300,
nms=dict(type='nms', iou_threshold=0.65)))
pre_transform = [
dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),
dict(type='LoadAnnotations', with_bbox=True)
]
train_pipeline_stage1 = [
dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),
dict(type='LoadAnnotations', with_bbox=True),
dict(
type='Mosaic',
img_scale=(320, 320),
pad_val=114.0,
pre_transform=[
dict(
type='LoadImageFromFile',
file_client_args=dict(backend='disk')),
dict(type='LoadAnnotations', with_bbox=True)
]),
dict(
type='mmdet.RandomAffine',
scaling_ratio_range=(0.1, 2),
border=(-160, -160)),
dict(
type='YOLOXMixUp',
img_scale=(320, 320),
ratio_range=(0.8, 1.6),
pad_val=114.0,
pre_transform=[
dict(
type='LoadImageFromFile',
file_client_args=dict(backend='disk')),
dict(type='LoadAnnotations', with_bbox=True)
]),
dict(type='mmdet.YOLOXHSVRandomAug'),
dict(type='mmdet.RandomFlip', prob=0.5),
dict(
type='mmdet.FilterAnnotations',
min_gt_bbox_wh=(1, 1),
keep_empty=False),
dict(
type='mmdet.PackDetInputs',
meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'flip',
'flip_direction'))
]
train_pipeline_stage2 = [
dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),
dict(type='LoadAnnotations', with_bbox=True),
dict(type='mmdet.Resize', scale=(320, 320), keep_ratio=True),
dict(
type='mmdet.Pad',
pad_to_square=True,
pad_val=dict(img=(114.0, 114.0, 114.0))),
dict(type='mmdet.YOLOXHSVRandomAug'),
dict(type='mmdet.RandomFlip', prob=0.5),
dict(
type='mmdet.FilterAnnotations',
min_gt_bbox_wh=(1, 1),
keep_empty=False),
dict(type='mmdet.PackDetInputs')
]
train_dataloader = dict(
batch_size=16,
num_workers=4,
persistent_workers=True,
pin_memory=False,
sampler=dict(type='DefaultSampler', shuffle=True),
dataset=dict(
type='YOLOv5CocoDataset',
data_root='/home/yxy/mmdetection/data/coco/',
ann_file=
'/home/yxy/mmdetection/data/coco/annotations/instances_train2017.json',
data_prefix=dict(img='train2017/'),
filter_cfg=dict(filter_empty_gt=False, min_size=32),
pipeline=[
dict(
type='LoadImageFromFile',
file_client_args=dict(backend='disk')),
dict(type='LoadAnnotations', with_bbox=True),
dict(
type='Mosaic',
img_scale=(320, 320),
pad_val=114.0,
pre_transform=[
dict(
type='LoadImageFromFile',
file_client_args=dict(backend='disk')),
dict(type='LoadAnnotations', with_bbox=True)
]),
dict(
type='mmdet.RandomAffine',
scaling_ratio_range=(0.1, 2),
border=(-160, -160)),
dict(
type='YOLOXMixUp',
img_scale=(320, 320),
ratio_range=(0.8, 1.6),
pad_val=114.0,
pre_transform=[
dict(
type='LoadImageFromFile',
file_client_args=dict(backend='disk')),
dict(type='LoadAnnotations', with_bbox=True)
]),
dict(type='mmdet.YOLOXHSVRandomAug'),
dict(type='mmdet.RandomFlip', prob=0.5),
dict(
type='mmdet.FilterAnnotations',
min_gt_bbox_wh=(1, 1),
keep_empty=False),
dict(
type='mmdet.PackDetInputs',
meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
'flip', 'flip_direction'))
]))
test_pipeline = [
dict(type='LoadImageFromFile', file_client_args=dict(backend='disk')),
dict(type='mmdet.Resize', scale=(640, 640), keep_ratio=True),
dict(
type='mmdet.Pad',
pad_to_square=True,
pad_val=dict(img=(114.0, 114.0, 114.0))),
dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),
dict(
type='mmdet.PackDetInputs',
meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
'scale_factor'))
]
val_dataloader = dict(
batch_size=1,
num_workers=2,
persistent_workers=True,
pin_memory=False,
drop_last=False,
sampler=dict(type='DefaultSampler', shuffle=False),
dataset=dict(
type='YOLOv5CocoDataset',
data_root='/home/yxy/mmdetection/data/coco/',
ann_file=
'/home/yxy/mmdetection/data/coco/annotations/instances_val2017.json',
data_prefix=dict(img='train2017/'),
test_mode=True,
pipeline=[
dict(
type='LoadImageFromFile',
file_client_args=dict(backend='disk')),
dict(type='mmdet.Resize', scale=(640, 640), keep_ratio=True),
dict(
type='mmdet.Pad',
pad_to_square=True,
pad_val=dict(img=(114.0, 114.0, 114.0))),
dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),
dict(
type='mmdet.PackDetInputs',
meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
'scale_factor'))
],
metainfo=dict(
CLASSES=('holothurian', 'echinus', 'scallop', 'starfish'),
PALETTE=[(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230)])))
test_dataloader = dict(
batch_size=1,
num_workers=2,
persistent_workers=True,
pin_memory=False,
drop_last=False,
sampler=dict(type='DefaultSampler', shuffle=False),
dataset=dict(
type='YOLOv5CocoDataset',
data_root='/home/yxy/mmdetection/data/coco/',
ann_file=
'/home/yxy/mmdetection/data/coco/annotations/instances_val2017.json',
data_prefix=dict(img='train2017/'),
test_mode=True,
pipeline=[
dict(
type='LoadImageFromFile',
file_client_args=dict(backend='disk')),
dict(type='mmdet.Resize', scale=(640, 640), keep_ratio=True),
dict(
type='mmdet.Pad',
pad_to_square=True,
pad_val=dict(img=(114.0, 114.0, 114.0))),
dict(type='LoadAnnotations', with_bbox=True, _scope_='mmdet'),
dict(
type='mmdet.PackDetInputs',
meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
'scale_factor'))
],
metainfo=dict(
CLASSES=('holothurian', 'echinus', 'scallop', 'starfish'),
PALETTE=[(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230)])))
val_evaluator = dict(
type='mmdet.CocoMetric',
proposal_nums=(100, 1, 10),
ann_file=
'/home/yxy/mmdetection/data/coco/annotations/instances_val2017.json',
metric='bbox')
test_evaluator = dict(
type='mmdet.CocoMetric',
proposal_nums=(100, 1, 10),
ann_file=
'/home/yxy/mmdetection/data/coco/annotations/instances_val2017.json',
metric='bbox')
base_lr = 0.01
optim_wrapper = dict(
type='OptimWrapper',
optimizer=dict(
type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005, nesterov=True),
paramwise_cfg=dict(norm_decay_mult=0.0, bias_decay_mult=0.0))
param_scheduler = [
dict(
type='mmdet.QuadraticWarmupLR',
by_epoch=True,
begin=0,
end=5,
convert_to_iter_based=True),
dict(
type='CosineAnnealingLR',
eta_min=0.0005,
begin=5,
T_max=285,
end=285,
by_epoch=True,
convert_to_iter_based=True),
dict(type='ConstantLR', by_epoch=True, factor=1, begin=285, end=300)
]
custom_hooks = [
dict(
type='YOLOXModeSwitchHook',
num_last_epochs=15,
new_train_pipeline=[
dict(
type='LoadImageFromFile',
file_client_args=dict(backend='disk')),
dict(type='LoadAnnotations', with_bbox=True),
dict(type='mmdet.Resize', scale=(640, 640), keep_ratio=True),
dict(
type='mmdet.Pad',
pad_to_square=True,
pad_val=dict(img=(114.0, 114.0, 114.0))),
dict(type='mmdet.YOLOXHSVRandomAug'),
dict(type='mmdet.RandomFlip', prob=0.5),
dict(
type='mmdet.FilterAnnotations',
min_gt_bbox_wh=(1, 1),
keep_empty=False),
dict(type='mmdet.PackDetInputs')
],
priority=48),
dict(type='mmdet.SyncNormHook', priority=48),
dict(
type='EMAHook',
ema_type='ExpMomentumEMA',
momentum=0.0001,
update_buffers=True,
strict_load=False,
priority=49)
]
train_cfg = dict(
type='EpochBasedTrainLoop',
max_epochs=300,
val_interval=1,
dynamic_intervals=[(285, 1)])
auto_scale_lr = dict(base_batch_size=64)
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
work_dir = './work_dirs/my_right_yolox_s'
num_classes = 4
metainfo = dict(
CLASSES=('holothurian', 'echinus', 'scallop', 'starfish'),
PALETTE=[(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230)])
launcher = 'none'
Hi @yangxiaoyany
Please run `python mmyolo/utils/collect_env.py` to collect the necessary environment information and paste it here. You may also add anything that may be helpful for locating the problem, such as:
- How you installed PyTorch (e.g., pip, conda, source)
- Other environment variables that may be related (such as `$PATH`, `$LD_LIBRARY_PATH`, `$PYTHONPATH`, etc.)
The environment is an old one that I used to run MMDetection; I changed the mmdet and mmcv versions to match MMYOLO.
sys.platform: linux
Python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0,1: NVIDIA GeForce RTX 2080 Ti
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.2, V10.2.8
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.7.0
PyTorch compiling details: PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) Math Kernel Library Version 2020.0.4 Product Build 20200917 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 10.1
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
- CuDNN 7.6.3
- Magma 2.5.2
- Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
TorchVision: 0.8.1
OpenCV: 4.5.1
MMEngine: 0.3.2
MMCV: 2.0.0rc3
MMDetection: 3.0.0rc5
MMYOLO: 0.2.0+
I also found that when I browsed the cat dataset following the official instructions, the output showed the person category in the pictures; this may be why the loss is not 0 there. But when I browse my own dataset, there are no gt bboxes.
Hi @yangxiaoyany Since OpenMMLab has recently been upgrading some details, please use mmdet 3.0.0rc4.
I changed the mmdet version to 3.0.0rc4; nothing changed.
You haven't set `metainfo` in `train_dataloader` ...
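For anyone hitting the same problem, below is a minimal sketch of the fix (the base config name, class names, and palette are taken from this thread). `metainfo` must sit inside the `dataset` dict of `train_dataloader`, not just at the top level of the config. Also, the `dataset_meta` printout earlier suggests that on this mmdet/mmengine combination the effective keys are the lowercase `classes`/`palette`, so use whichever spelling your installed version actually reads:

```python
_base_ = './yolox_s_8xb8-300e_coco.py'

num_classes = 4
metainfo = dict(
    classes=('holothurian', 'echinus', 'scallop', 'starfish'),
    palette=[(220, 20, 60), (119, 11, 32), (0, 0, 142), (0, 0, 230)])

model = dict(bbox_head=dict(head_module=dict(num_classes=num_classes)))

# metainfo has to be attached to every dataset, including the training one.
train_dataloader = dict(dataset=dict(metainfo=metainfo))
val_dataloader = dict(dataset=dict(metainfo=metainfo))
test_dataloader = dict(dataset=dict(metainfo=metainfo))
```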
Thank you very much. That solved the problem. I forgot to add `metainfo` in the train_dataloader.
Okay, Thx for using MMYOLO 😄
I added `metainfo` in the train_dataloader, but my loss_cls and loss_bbox are still 0.0000.
💬 Describe the reimplementation questions
2022/12/27 17:08:37 - mmengine - INFO - Epoch(train) [1][300/969] lr: 3.8340e-05 eta: 20:48:18 time: 0.2513 data_time: 0.0467 memory: 5362 loss: 0.7244 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.7244
2022/12/27 17:08:49 - mmengine - INFO - Epoch(train) [1][350/969] lr: 5.2185e-05 eta: 20:35:15 time: 0.2394 data_time: 0.0772 memory: 5362 loss: 0.4497 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.4497
2022/12/27 17:09:01 - mmengine - INFO - Epoch(train) [1][400/969] lr: 6.8160e-05 eta: 20:31:31 time: 0.2494 data_time: 0.0234 memory: 5362 loss: 0.5124 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.5124
2022/12/27 17:09:13 - mmengine - INFO - Epoch(train) [1][450/969] lr: 8.6266e-05 eta: 20:18:02 time: 0.2299 data_time: 0.0601 memory: 3875 loss: 0.2986 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.2986
2022/12/27 17:09:26 - mmengine - INFO - Epoch(train) [1][500/969] lr: 1.0650e-04 eta: 20:20:20 time: 0.2570 data_time: 0.0642 memory: 4936 loss: 0.2919 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.2919
2022/12/27 17:09:38 - mmengine - INFO - Epoch(train) [1][550/969] lr: 1.2887e-04 eta: 20:17:15 time: 0.2458 data_time: 0.0498 memory: 4228 loss: 0.2368 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.2368
2022/12/27 17:09:51 - mmengine - INFO - Epoch(train) [1][600/969] lr: 1.5336e-04 eta: 20:18:08 time: 0.2544 data_time: 0.0160 memory: 5362 loss: 0.2318 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.2318
2022/12/27 17:10:03 - mmengine - INFO - Epoch(train) [1][650/969] lr: 1.7999e-04 eta: 20:14:28 time: 0.2427 data_time: 0.0518 memory: 4552 loss: 0.1511 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.1511
2022/12/27 17:10:15 - mmengine - INFO - Epoch(train) [1][700/969] lr: 2.0874e-04 eta: 20:14:07 time: 0.2508 data_time: 0.0011 memory: 5362 loss: 0.1640 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.1640
2022/12/27 17:10:29 - mmengine - INFO - Epoch(train) [1][750/969] lr: 2.3963e-04 eta: 20:18:31 time: 0.2655 data_time: 0.0669 memory: 5362 loss: 0.1068 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.1068
2022/12/27 17:10:40 - mmengine - INFO - Epoch(train) [1][800/969] lr: 2.7264e-04 eta: 20:12:52 time: 0.2341 data_time: 0.0551 memory: 4936 loss: 0.0842 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.0842
2022/12/27 17:10:53 - mmengine - INFO - Epoch(train) [1][850/969] lr: 3.0779e-04 eta: 20:10:42 time: 0.2442 data_time: 0.0178 memory: 5362 loss: 0.0912 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.0912
2022/12/27 17:11:05 - mmengine - INFO - Epoch(train) [1][900/969] lr: 3.4506e-04 eta: 20:09:06 time: 0.2454 data_time: 0.0643 memory: 4936 loss: 0.0606 loss_cls: 0.0000 loss_bbox: 0.0000 loss_obj: 0.0606
Environment
I installed the environment following the MMYOLO installation instructions.
Expected results
No response
Additional information
I only revised num_classes to 4 and the paths to my dataset, and I checked my dataset with the browse code; there is no mistake.
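An editorial note on this follow-up report: the `dataset_meta` printout earlier in the thread shows that the effective metainfo keys are the lowercase `classes`/`palette`, and `CocoDataset` resolves category ids by matching those names against the `categories` in the annotation file, so a spelling mismatch (or metainfo that never reaches the train dataset) leaves every image without ground-truth boxes and keeps loss_cls/loss_bbox at exactly 0. A quick check, as a sketch (the annotation path is a placeholder):

```python
import json

# Placeholder path: point this at your training annotation file.
ann_file = 'data/coco/annotations/instances_train2017.json'
with open(ann_file) as f:
    names = [c['name'] for c in json.load(f)['categories']]
# These names must exactly match metainfo['classes'] (matching is by name).
print(names)
```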