Closed victoic closed 2 years ago
@victoic Please post your configuration.
As printed by pretty_text
Config:
dataset_type = 'CocoSubsetDataset'
data_root = 'data/coco'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='LoadAnnotations', with_bbox=True, with_mask=True, with_seg=True),
dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='SegRescale', scale_factor=0.125),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks', 'gt_semantic_seg'])
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
]
data = dict(
samples_per_gpu=2,
workers_per_gpu=0,
train=dict(
type='CocoSubsetDataset',
ann_file='annotations/COCO-subset.json',
img_prefix='images/train',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='LoadAnnotations',
with_bbox=True,
with_mask=True,
with_seg=True),
dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='SegRescale', scale_factor=0.125),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=[
'img', 'gt_bboxes', 'gt_labels', 'gt_masks',
'gt_semantic_seg'
])
],
seg_prefix='stuffthingmaps/train',
classes=('1', '2', '3', '4', '5', '6',
'7'),
data_root='data/coco'),
val=dict(
type='CocoSubsetDataset',
ann_file='annotations/COCO-subset.json',
img_prefix='images/eval',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
classes=('1', '2', '3', '4', '5', '6',
'7'),
data_root='data/coco'),
test=dict(
type='CocoSubsetDataset',
ann_file='annotations/COCO-subset.json',
img_prefix='images/eval',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
classes=('1', '2', '3', '4', '5', '6',
'7'),
data_root='data/coco'))
evaluation = dict(metric=['bbox', 'segm'])
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=500,
warmup_ratio=0.001,
step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
checkpoint_config = dict(interval=1)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
model = dict(
type='HybridTaskCascade',
backbone=dict(
type='DetectoRS_ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch',
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50'),
conv_cfg=dict(type='ConvAWS'),
sac=dict(type='SAC', use_deform=True),
stage_with_sac=(False, True, True, True),
output_img=True),
neck=dict(
type='RFP',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
num_outs=5,
rfp_steps=2,
aspp_out_channels=64,
aspp_dilations=(1, 3, 6, 1),
rfp_backbone=dict(
rfp_inplanes=256,
type='DetectoRS_ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
conv_cfg=dict(type='ConvAWS'),
sac=dict(type='SAC', use_deform=True),
stage_with_sac=(False, True, True, True),
pretrained='torchvision://resnet50',
style='pytorch')),
rpn_head=dict(
type='RPNHead',
in_channels=256,
feat_channels=256,
anchor_generator=dict(
type='AnchorGenerator',
scales=[8],
ratios=[0.5, 1.0, 2.0],
strides=[4, 8, 16, 32, 64]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[1.0, 1.0, 1.0, 1.0]),
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
loss_bbox=dict(
type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
roi_head=dict(
type='HybridTaskCascadeRoIHead',
interleaved=True,
mask_info_flow=True,
num_stages=3,
stage_loss_weights=[1, 0.5, 0.25],
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=[
dict(type='Shared2FCBBoxHead', num_classes=7),
dict(type='Shared2FCBBoxHead', num_classes=7),
dict(type='Shared2FCBBoxHead', num_classes=7)
],
mask_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
mask_head=[
dict(
type='HTCMaskHead',
with_conv_res=False,
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=7,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
dict(
type='HTCMaskHead',
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=7,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
dict(
type='HTCMaskHead',
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=7,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))
],
semantic_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
out_channels=256,
featmap_strides=[8]),
semantic_head=dict(
type='FusedSemanticHead',
num_ins=5,
fusion_level=1,
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=183,
ignore_label=255,
loss_weight=0.2)),
train_cfg=dict(
rpn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.3,
min_pos_iou=0.3,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.5,
neg_pos_ub=-1,
add_gt_as_proposals=False),
allowed_border=0,
pos_weight=-1,
debug=False),
rpn_proposal=dict(
nms_pre=2000,
max_per_img=2000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=[
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.5,
neg_iou_thr=0.5,
min_pos_iou=0.5,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False),
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.6,
neg_iou_thr=0.6,
min_pos_iou=0.6,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False),
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.7,
min_pos_iou=0.7,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False)
]),
test_cfg=dict(
rpn=dict(
nms_pre=1000,
max_per_img=1000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=dict(
score_thr=0.001,
nms=dict(type='nms', iou_threshold=0.5),
max_per_img=100,
mask_thr_binary=0.5)))
classes = ('1', '2', '3', '4', '5', '6', '7')
work_dir = '/content/gdrive/MyDrive/COCO Subset/DetectoRS-master/checkpoints'
seed = 0
gpu_ids = range(0, 1)
I don't really know if it helps, but I've printed:
a) the image from `results['filename']`;
b) the segmented image from stuffthingmaps, using `results['seg_prefix']+'/'+results['ann_info']['seg_map']`;
c) the mask passed as parameter to `mmcv.impad()`.
I printed both a) and b) from inside the `_pad_masks` method of the `Pad` pipeline class, while c) is printed inside the `_pad` function of the `BitmapMasks` class. Hope this is helpful. (This is for file 000000279522.jpg, using seed 0.)
Another update. I believe I have narrowed the problem down to the annotations file, but I still don't know exactly what is wrong. Using the same data structure with the annotation files from the original COCO dataset results in no error; when the file is changed to the subset annotation (created through VIA export), the error occurs.
I still have no clue what could be causing it, since the mask/bbox values are the same as in the original COCO annotation.
I suggest you convert the format exported by VIA to COCO format. It is difficult for me to judge where the problem is based on the current information.
Okay, I solved the problem. I don't know exactly what it was, but let me explain.

> I suggest you convert the format exported by VIA to COCO format. It is difficult for me to judge where the problem is based on the current information.

The data was always in the COCO format: the VIA tool has an option to export annotations in the COCO format, which is what I used to create the subset. But something was indeed wrong with the file. I manually generated the subset using a script to filter the original COCO annotations file, and the problem was solved.
I attach two files in this response, coco_subset_train.json.txt and COCO-subset-train.json.txt. The latter is the annotation file generated by VIA; the former is the one created by filtering. In case anyone is curious to find out what is different about them, I couldn't find it. They are .txt because .json files are not allowed here.
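For readers who want to try the filtering approach described above, here is a minimal sketch of building a subset from a COCO-style annotation dictionary. It is not the script used in this thread: the function name `filter_coco_by_category` and the decision to drop images that end up without annotations are assumptions made for illustration.

```python
def filter_coco_by_category(coco, keep_names):
    """Return a new COCO-style dict restricted to the named categories."""
    keep_names = set(keep_names)
    categories = [c for c in coco["categories"] if c["name"] in keep_names]
    keep_cat_ids = {c["id"] for c in categories}
    # Keep only annotations whose category survived the filter.
    annotations = [a for a in coco["annotations"]
                   if a["category_id"] in keep_cat_ids]
    # Assumption: images left without any annotation are dropped as well.
    keep_img_ids = {a["image_id"] for a in annotations}
    images = [im for im in coco["images"] if im["id"] in keep_img_ids]
    # Copy any other top-level fields (e.g. "info", "licenses") unchanged.
    subset = {k: v for k, v in coco.items()
              if k not in ("images", "annotations", "categories")}
    subset.update(images=images, annotations=annotations, categories=categories)
    return subset


# Tiny in-memory example; real use would json.load() the original COCO file.
example = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 27, "bbox": [0, 0, 5, 5]},
        {"id": 11, "image_id": 2, "category_id": 1, "bbox": [1, 1, 4, 4]},
    ],
    "categories": [{"id": 1, "name": "person"}, {"id": 27, "name": "backpack"}],
}
subset = filter_coco_by_category(example, ["backpack"])
print(len(subset["images"]), len(subset["annotations"]))
```

Because the original category ids and names are carried over untouched, a subset built this way keeps the same label mapping as the full COCO file.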
However, I've noticed two new issues. I'm wondering whether I should open new issues or could use this one. They are:
1. The losses are `nan` using DetectoRS. This happens even when using the original COCO annotations. I couldn't finish one epoch with the full COCO dataset due to the Colab timeout, but here is the log from the last iteration before the timeout:
2021-09-01 09:06:21,235 - mmdet - INFO - Epoch [1][30400/58633] lr: 2.000e-02, eta: 17 days, 3:12:27, time: 2.195, data_time: 0.116, memory: 11767, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 25.9840, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 25.9840, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 25.9840, s2.loss_bbox: nan, s2.loss_mask: nan, loss: nan
2. `RuntimeError: CUDA error: device-side assert triggered` when I load weights to some layers. Without the loading step, `train_detector` runs (with only issue 1). The weights are loaded using:
# Load and filter state_dict
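import torch

loaded_dict = torch.load('/content/gdrive/MyDrive/Doutorado/COCO Subset/DetectoRS-master/model/detectors_htc_r50_1x_coco-329b1453.pth')
# Collect the keys first, then remove them, so the dict is not modified while iterating
del_keys = []
for k in loaded_dict['state_dict'].keys():
    if 'head' in k:
        del_keys.append(k)
for k in del_keys:
    loaded_dict['state_dict'].pop(k)
# strict=False lets the model keep its own (randomly initialized) head weights
model.load_state_dict(loaded_dict['state_dict'], strict=False)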
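The same key filtering can be written without mutating the loaded checkpoint. The sketch below uses a plain dictionary so it runs without PyTorch; the helper name `drop_keys_containing` is hypothetical. Note that a substring match on `'head'` removes `rpn_head.*` as well as `roi_head.*` entries.

```python
def drop_keys_containing(state_dict, substring="head"):
    """Return a copy of state_dict without the keys that contain `substring`."""
    return {k: v for k, v in state_dict.items() if substring not in k}


# Stand-in for checkpoint['state_dict']; the values would normally be tensors.
fake_state_dict = {
    "backbone.conv1.weight": 0,
    "neck.lateral_convs.0.conv.weight": 1,
    "rpn_head.rpn_conv.weight": 2,
    "roi_head.bbox_head.0.fc_cls.weight": 3,
}
filtered = drop_keys_containing(fake_state_dict)
print(sorted(filtered))

# With PyTorch this would typically be used as (not executed here):
#   checkpoint = torch.load(path, map_location="cpu")
#   model.load_state_dict(drop_keys_containing(checkpoint["state_dict"]),
#                         strict=False)
```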
Also, when an epoch is finished and validation is run I get:
mmdet - ERROR - The testing results of the whole dataset is empty.
I'd like to point out again that I'm using this annotation, which is simply a subset of the COCO Dataset. Why would this happen?
1. I visualized both JSON files, but it seems that the label in `coco_subset_train.json` is wrong. Instead, I successfully loaded the annotations with `COCO-subset-train.json` and the problem you mentioned did not appear.
2. For the losses, you can add `grad_clip` in the config file.
optimizer_config = dict(
    _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
3. For loading the weights, you can set its `map_location`.

First, I would like to thank you for your time and help. Since I'm new to mmdetection I may be misunderstanding things, and your guidance has cleared up a lot for me.

> 1. I visualized both JSON files, but it seems that the label in `coco_subset_train.json` is wrong. Instead, I successfully loaded the annotations with `COCO-subset-train.json` and the problem you mentioned did not appear.

`coco_subset_train.json` does not use all of the classes. It uses the following: CLASSES = ['backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'toilet', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'toaster', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']
However, the annotations within both files are identical, as I could check by executing the following code:
>>> import json
>>> f1 = "coco_subset_train.json"
>>> f2 = "COCO-subset-train.json"
>>> file1 = open(f1,'r')
>>> file2 = open(f2,'r')
>>> j1 = json.load(file1)
>>> j2 = json.load(file2)
>>> anns1 = j1['annotations']
>>> anns2 = j2['annotations']
>>> print(len(anns1), len(anns2))
6597 6662
>>> anns1_byId = {ann['id']: ann for ann in anns1}
>>> anns2_byId = {ann['id']: ann for ann in anns2}
>>> equals_segmentation = 0
>>> equals_bbox = 0
>>> equals_category_id = 0
>>> for k in anns1_byId.keys():
...     a1 = anns1_byId[k]
...     a2 = anns2_byId[k]
...     if a1['segmentation'] == a2['segmentation']:
...         equals_segmentation+=1
...     if a1['bbox'] == a2['bbox']:
...         equals_bbox+=1
...     if a1['category_id'] == a2['category_id']:
...         equals_category_id+=1
...
>>> print(equals_segmentation)
6597
>>> print(equals_bbox)
6597
>>> print(equals_category_id)
6597
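The check above compares only annotations whose id exists in the first file, and the two files report different lengths (6597 vs 6662), so a difference could still sit in the `categories` table or in annotations present in only one file. A minimal sketch of that extra comparison follows; the function name `compare_coco_metadata` is hypothetical, and it assumes both inputs are already-loaded COCO-style dictionaries.

```python
def compare_coco_metadata(j1, j2):
    """Report category and annotation-id differences between two COCO dicts."""
    cats1 = {c["id"]: c["name"] for c in j1.get("categories", [])}
    cats2 = {c["id"]: c["name"] for c in j2.get("categories", [])}
    ids1 = {a["id"] for a in j1.get("annotations", [])}
    ids2 = {a["id"] for a in j2.get("annotations", [])}
    shared = cats1.keys() & cats2.keys()
    return {
        # Same category id, different name: labels would be shifted.
        "category_name_mismatch": {
            cid: (cats1[cid], cats2[cid])
            for cid in sorted(shared) if cats1[cid] != cats2[cid]
        },
        "categories_only_in_first": sorted(cats1.keys() - cats2.keys()),
        "categories_only_in_second": sorted(cats2.keys() - cats1.keys()),
        "annotations_only_in_first": sorted(ids1 - ids2),
        "annotations_only_in_second": sorted(ids2 - ids1),
    }


# Small made-up inputs to show the shape of the report.
a = {"categories": [{"id": 27, "name": "backpack"}],
     "annotations": [{"id": 1}, {"id": 2}]}
b = {"categories": [{"id": 27, "name": "1"}, {"id": 28, "name": "2"}],
     "annotations": [{"id": 1}, {"id": 2}, {"id": 3}]}
report = compare_coco_metadata(a, b)
print(report)
```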
> 2. For the losses, you can add grad_clip in the config file.
> ```
> optimizer_config = dict(
>     _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
> ```

After adding this to the config file I get the following error:
/content/mmdetection/mmdet/apis/train.py in train_detector(model, dataset, cfg, distributed, validate, timestamp, meta)
126 runner.register_training_hooks(cfg.lr_config, optimizer_config,
127 cfg.checkpoint_config, cfg.log_config,
--> 128 cfg.get('momentum_config', None))
129 if distributed:
130 if isinstance(runner, EpochBasedRunner):
/usr/local/lib/python3.7/dist-packages/mmcv/runner/base_runner.py in register_training_hooks(self, lr_config, optimizer_config, checkpoint_config, log_config, momentum_config, timer_config, custom_hooks_config)
536 will be triggered after default hooks.
537 """
--> 538 self.register_lr_hook(lr_config)
539 self.register_momentum_hook(momentum_config)
540 self.register_optimizer_hook(optimizer_config)
/usr/local/lib/python3.7/dist-packages/mmcv/runner/base_runner.py in register_lr_hook(self, lr_config)
418 else:
419 hook = lr_config
--> 420 self.register_hook(hook, priority='VERY_HIGH')
421
422 def register_momentum_hook(self, momentum_config):
/usr/local/lib/python3.7/dist-packages/mmcv/runner/base_runner.py in register_hook(self, hook, priority)
266 Lower value means higher priority.
267 """
--> 268 assert isinstance(hook, Hook)
269 if hasattr(hook, 'priority'):
270 raise ValueError('"priority" is a reserved attribute for hooks')
AssertionError:
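A later reply in this thread reports that this error went away once the setting was written in the config file itself instead of being added during execution. As far as I understand mmcv's `Config`, the `_delete_` key is only consumed while a config is merged with its `_base_` files (or through `merge_from_dict`), so a fragment along these lines may be enough. It assumes no inherited `optimizer_config` needs to be removed, which is why `_delete_` is left out.

```python
# Written directly in the config file. Assumption: this config does not
# inherit an `optimizer_config` from a `_base_` file that must be discarded,
# so the `_delete_=True` key is omitted.
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
```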
> 3. For the loading weights, you can set its `map_location`.

I'm aware of the `map_location` parameter; however, the error does not occur during the loading of the weights, but when the training starts. I'll try it as soon as I can and will report back.
But most importantly, do you have any insight into why the test metrics would appear as `mmdet - ERROR - The testing results of the whole dataset is empty.`?
> But most importantly, do you have any insight on why would the test metrics appear as `mmdet - ERROR - The testing results of the whole dataset is empty.`?

This error occurs at L438 in `mmdet/datasets/coco.py`; you can check what happened.

> After adding this to the config file I get the following error:

This error shouldn't happen.
I noticed that you used your own dataset. Why not use `CocoDataset` directly?
> This error occurs at L438 in `mmdet/datasets/coco.py`, you can check what happened.

I've printed some of the variables inside `evaluate` to check what could be wrong, and it seems `results` is entering the `evaluate()` method empty. I also noticed that my model is indeed returning an empty list at L28 of `mmdet/apis/test.py`.
Json Prefix: /content/gdrive/MyDrive/Doutorado/COCO Subset/DetectoRS-master/checkpoints/subset
Metrics: ['bbox', 'segm']
Result Files: {'bbox': '/content/gdrive/MyDrive/Doutorado/COCO Subset/DetectoRS-master/checkpoints/subset.bbox.json', 'proposal': '/content/gdrive/MyDrive/Doutorado/COCO Subset/DetectoRS-master/checkpoints/subset.bbox.json', 'segm': '/content/gdrive/MyDrive/Doutorado/COCO Subset/DetectoRS-master/checkpoints/subset.segm.json'}
Eval Results: OrderedDict()
Results: [([array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), 
dtype=float32)], [[], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], []]), # I'm truncating this because it's too long, but it just repeats)]
But the input Data seems to be ok:
Data: {'img_metas': [DataContainer([[{'filename': 'data/coco/images/eval/000000073922.jpg', 'ori_filename': '000000073922.jpg', 'ori_shape': (491, 640, 3), 'img_shape': (800, 1043, 3), 'pad_shape': (800, 1056, 3), 'scale_factor': array([1.6296875, 1.6293279, 1.6296875, 1.6293279], dtype=float32), 'flip': False, 'flip_direction': None, 'img_norm_cfg': {'mean': array([123.675, 116.28 , 103.53 ], dtype=float32), 'std': array([58.395, 57.12 , 57.375], dtype=float32), 'to_rgb': True}, 'batch_input_shape': (800, 1056)}]])], 'img': [tensor([[[[-0.3198, -0.2513, -0.1657, ..., 0.0000, 0.0000, 0.0000],
[-0.3198, -0.2684, -0.1999, ..., 0.0000, 0.0000, 0.0000],
[-0.3027, -0.2856, -0.2342, ..., 0.0000, 0.0000, 0.0000],
...,
[-1.9980, -2.0152, -2.0152, ..., 0.0000, 0.0000, 0.0000],
[-1.9980, -1.9980, -2.0152, ..., 0.0000, 0.0000, 0.0000],
[-1.9809, -1.9980, -2.0152, ..., 0.0000, 0.0000, 0.0000]],
[[-0.2150, -0.1450, -0.0399, ..., 0.0000, 0.0000, 0.0000],
[-0.2150, -0.1450, -0.0749, ..., 0.0000, 0.0000, 0.0000],
[-0.1975, -0.1625, -0.1099, ..., 0.0000, 0.0000, 0.0000],
...,
[-1.5455, -1.5630, -1.5630, ..., 0.0000, 0.0000, 0.0000],
[-1.5280, -1.5455, -1.5630, ..., 0.0000, 0.0000, 0.0000],
[-1.5280, -1.5455, -1.5630, ..., 0.0000, 0.0000, 0.0000]],
[[-0.5147, -0.4798, -0.4101, ..., 0.0000, 0.0000, 0.0000],
[-0.5147, -0.4798, -0.4450, ..., 0.0000, 0.0000, 0.0000],
[-0.4973, -0.4973, -0.4798, ..., 0.0000, 0.0000, 0.0000],
...,
[-1.4036, -1.4036, -1.4210, ..., 0.0000, 0.0000, 0.0000],
[-1.4036, -1.4036, -1.4210, ..., 0.0000, 0.0000, 0.0000],
[-1.3861, -1.4036, -1.4210, ..., 0.0000, 0.0000, 0.0000]]]])]}
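To confirm quickly that every image really produced zero detections, the per-class arrays can be counted. The sketch below assumes the result layout shown in the dump above (one `(bbox_results, segm_results)` tuple per image, each a per-class list) and uses only `len()`, so it works on NumPy arrays and plain lists alike; the name `count_detections` is hypothetical. If all counts are zero after training with `nan` losses, one plausible explanation (an assumption, not a confirmed diagnosis) is that the scores are `nan` and therefore never pass `score_thr`.

```python
def count_detections(results):
    """Return the number of predicted boxes per image.

    Each entry of `results` is either a per-class list of box arrays or a
    (bbox_results, segm_results) tuple, as in the dump above.
    """
    counts = []
    for per_image in results:
        bbox_results = per_image[0] if isinstance(per_image, tuple) else per_image
        counts.append(sum(len(per_class) for per_class in bbox_results))
    return counts


# Plain lists stand in for the (N, 5) arrays of [x1, y1, x2, y2, score].
demo_results = [
    ([[], []], [[], []]),                                # image with no boxes
    ([[[0, 0, 10, 10, 0.9]], []], [[None], []]),         # one box of class 0
]
counts = count_detections(demo_results)
print(counts, "all empty:", not any(counts))
```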
> This error shouldn't happen.
> I noticed that you used your own dataset, why not use CocoDataset directly.

I changed the dataset class to `CocoDataset` and fixed the problem with `optimizer_config`. I believe the problem was that I was adding `_delete_ = True` during execution, not directly in the config file. However, after the changes the loss remains `nan`.
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
/content/mmdetection/mmdet/core/anchor/anchor_generator.py:324: UserWarning: ``grid_anchors`` would be deprecated soon. Please use ``grid_priors``
warnings.warn('``grid_anchors`` would be deprecated soon. '
/content/mmdetection/mmdet/core/anchor/anchor_generator.py:361: UserWarning: ``single_level_grid_anchors`` would be deprecated soon. Please use ``single_level_grid_priors``
'``single_level_grid_anchors`` would be deprecated soon. '
2021-09-05 15:28:18,707 - mmdet - INFO - Epoch [1][50/501] lr: 1.978e-03, eta: 3:26:02, time: 2.074, data_time: 0.148, memory: 10091, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 0.7206, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 0.7206, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 0.7206, s2.loss_bbox: nan, s2.loss_mask: nan, loss: nan
2021-09-05 15:30:05,031 - mmdet - INFO - Epoch [1][100/501] lr: 3.976e-03, eta: 3:26:55, time: 2.126, data_time: 0.124, memory: 10263, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 1.0023, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 1.0023, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 1.0023, s2.loss_bbox: nan, s2.loss_mask: nan, loss: nan
2021-09-05 15:31:54,955 - mmdet - INFO - Epoch [1][150/501] lr: 5.974e-03, eta: 3:28:22, time: 2.199, data_time: 0.106, memory: 10263, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 0.1538, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 0.1538, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 0.1538, s2.loss_bbox: nan, s2.loss_mask: nan, loss: nan
2021-09-05 15:33:41,815 - mmdet - INFO - Epoch [1][200/501] lr: 7.972e-03, eta: 3:26:42, time: 2.137, data_time: 0.100, memory: 10263, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 0.0952, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 0.0952, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 0.0952, s2.loss_bbox: nan, s2.loss_mask: nan, loss: nan
2021-09-05 15:35:26,944 - mmdet - INFO - Epoch [1][250/501] lr: 9.970e-03, eta: 3:24:19, time: 2.103, data_time: 0.101, memory: 10263, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 1.3275, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 1.3275, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 1.3275, s2.loss_bbox: nan, s2.loss_mask: nan, loss: nan
2021-09-05 15:37:13,353 - mmdet - INFO - Epoch [1][300/501] lr: 1.197e-02, eta: 3:22:33, time: 2.128, data_time: 0.100, memory: 10263, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 0.8386, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 0.8386, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 0.8386, s2.loss_bbox: nan, s2.loss_mask: nan, loss: nan
2021-09-05 15:39:00,732 - mmdet - INFO - Epoch [1][350/501] lr: 1.397e-02, eta: 3:21:03, time: 2.148, data_time: 0.101, memory: 10263, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 0.1250, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 0.1250, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 0.1250, s2.loss_bbox: nan, s2.loss_mask: nan, loss: nan
2021-09-05 15:40:49,694 - mmdet - INFO - Epoch [1][400/501] lr: 1.596e-02, eta: 3:19:50, time: 2.179, data_time: 0.103, memory: 10263, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 0.3214, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 0.3214, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 0.3214, s2.loss_bbox: nan, s2.loss_mask: nan, loss: nan
2021-09-05 15:42:35,815 - mmdet - INFO - Epoch [1][450/501] lr: 1.796e-02, eta: 3:17:55, time: 2.122, data_time: 0.098, memory: 10263, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 0.5714, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 0.5714, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 0.5714, s2.loss_bbox: nan, s2.loss_mask: nan, loss: nan
2021-09-05 15:44:23,355 - mmdet - INFO - Epoch [1][500/501] lr: 1.996e-02, eta: 3:16:17, time: 2.151, data_time: 0.104, memory: 10263, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 0.4469, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 0.4469, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 0.4469, s2.loss_bbox: nan, s2.loss_mask: nan, loss: nan
2021-09-05 15:44:25,330 - mmdet - INFO - Saving checkpoint at 1 epochs
I'm truly at a loss as to what could be happening here. My datasets seem to be OK, as you can see from the printout below.
Train:
CocoDataset Train dataset with number of images 1001, and instance counts:
+-------------------+-------+---------------------+-------+-----------------+-------+-----------------+-------+--------------------+-------+
| category | count | category | count | category | count | category | count | category | count |
+-------------------+-------+---------------------+-------+-----------------+-------+-----------------+-------+--------------------+-------+
| 0 [backpack] | 95 | 1 [umbrella] | 102 | 2 [handbag] | 114 | 3 [tie] | 63 | 4 [suitcase] | 80 |
| 5 [frisbee] | 23 | 6 [skis] | 70 | 7 [snowboard] | 36 | 8 [sports ball] | 72 | 9 [kite] | 109 |
| 10 [baseball bat] | 41 | 11 [baseball glove] | 52 | 12 [skateboard] | 37 | 13 [surfboard] | 44 | 14 [tennis racket] | 34 |
| 15 [bottle] | 577 | 16 [wine glass] | 202 | 17 [cup] | 479 | 18 [fork] | 130 | 19 [knife] | 178 |
| 20 [spoon] | 152 | 21 [bowl] | 296 | 22 [banana] | 197 | 23 [apple] | 124 | 24 [sandwich] | 117 |
| 25 [orange] | 127 | 26 [broccoli] | 192 | 27 [carrot] | 237 | 28 [hot dog] | 85 | 29 [pizza] | 76 |
| 30 [donut] | 146 | 31 [cake] | 120 | 32 [chair] | 670 | 33 [couch] | 117 | 34 [potted plant] | 126 |
| 35 [bed] | 55 | 36 [toilet] | 34 | 37 [laptop] | 107 | 38 [mouse] | 62 | 39 [remote] | 94 |
| 40 [keyboard] | 70 | 41 [cell phone] | 57 | 42 [toaster] | 23 | 43 [book] | 427 | 44 [clock] | 65 |
| 45 [vase] | 109 | 46 [scissors] | 51 | 47 [teddy bear] | 56 | 48 [hair drier] | 20 | 49 [toothbrush] | 46 |
+-------------------+-------+---------------------+-------+-----------------+-------+-----------------+-------+--------------------+-------+]
Test/Val:
CocoDataset Train dataset with number of images 492, and instance counts:
+-------------------+-------+---------------------+-------+-----------------+-------+-----------------+-------+--------------------+-------+
| category | count | category | count | category | count | category | count | category | count |
+-------------------+-------+---------------------+-------+-----------------+-------+-----------------+-------+--------------------+-------+
| 0 [backpack] | 55 | 1 [umbrella] | 75 | 2 [handbag] | 70 | 3 [tie] | 26 | 4 [suitcase] | 26 |
| 5 [frisbee] | 14 | 6 [skis] | 33 | 7 [snowboard] | 13 | 8 [sports ball] | 65 | 9 [kite] | 9 |
| 10 [baseball bat] | 39 | 11 [baseball glove] | 37 | 12 [skateboard] | 8 | 13 [surfboard] | 29 | 14 [tennis racket] | 28 |
| 15 [bottle] | 225 | 16 [wine glass] | 50 | 17 [cup] | 260 | 18 [fork] | 76 | 19 [knife] | 132 |
| 20 [spoon] | 80 | 21 [bowl] | 161 | 22 [banana] | 62 | 23 [apple] | 108 | 24 [sandwich] | 51 |
| 25 [orange] | 77 | 26 [broccoli] | 39 | 27 [carrot] | 89 | 28 [hot dog] | 29 | 29 [pizza] | 40 |
| 30 [donut] | 58 | 31 [cake] | 40 | 32 [chair] | 326 | 33 [couch] | 59 | 34 [potted plant] | 58 |
| 35 [bed] | 26 | 36 [toilet] | 20 | 37 [laptop] | 49 | 38 [mouse] | 29 | 39 [remote] | 51 |
| 40 [keyboard] | 35 | 41 [cell phone] | 51 | 42 [toaster] | 11 | 43 [book] | 262 | 44 [clock] | 33 |
| 45 [vase] | 51 | 46 [scissors] | 23 | 47 [teddy bear] | 35 | 48 [hair drier] | 10 | 49 [toothbrush] | 32 |
+-------------------+-------+---------------------+-------+-----------------+-------+-----------------+-------+--------------------+-------+
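Instance counts like the ones above do not show the geometry of each annotation. One thing that could be ruled out when every loss is `nan` is degenerate boxes (non-finite values, or zero or negative width or height). This is only a sanity check under that assumption, not a diagnosis of the problem in this thread, and the name `find_degenerate_boxes` is hypothetical.

```python
import math


def find_degenerate_boxes(coco):
    """Return ids of annotations whose [x, y, w, h] bbox looks unusable."""
    bad = []
    for ann in coco.get("annotations", []):
        bbox = ann.get("bbox", [])
        ok = (
            len(bbox) == 4
            and all(isinstance(v, (int, float)) and math.isfinite(v) for v in bbox)
            and bbox[2] > 0   # width
            and bbox[3] > 0   # height
        )
        if not ok:
            bad.append(ann["id"])
    return bad


# Made-up annotations: one valid, one zero-width, one with a NaN coordinate.
demo = {"annotations": [
    {"id": 1, "bbox": [10.0, 20.0, 30.0, 40.0]},
    {"id": 2, "bbox": [5.0, 5.0, 0.0, 12.0]},
    {"id": 3, "bbox": [float("nan"), 1.0, 2.0, 2.0]},
]}
bad_ids = find_degenerate_boxes(demo)
print(bad_ids)
```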
I've redone the annotation files using only pycocotools, and I'm uploading the files here again (coco_subset_eval.txt, coco_subset_train.txt). Here is my current config, so it can be verified that everything is OK:
Config:
dataset_type = 'CocoDataset'
data_root = 'data/coco'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='LoadAnnotations', with_bbox=True, with_mask=True, with_seg=True),
dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='SegRescale', scale_factor=0.125),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks', 'gt_semantic_seg'])
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
]
data = dict(
samples_per_gpu=2,
workers_per_gpu=0,
train=dict(
type='CocoDataset',
ann_file='annotations/coco_subset_train.json',
img_prefix='images/train',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='LoadAnnotations',
with_bbox=True,
with_mask=True,
with_seg=True),
dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='SegRescale', scale_factor=0.125),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=[
'img', 'gt_bboxes', 'gt_labels', 'gt_masks',
'gt_semantic_seg'
])
],
seg_prefix='stuffthingmaps/train',
classes=('backpack', 'umbrella', 'handbag', 'tie', 'suitcase',
'frisbee', 'skis', 'snowboard', 'sports ball', 'kite',
'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
'tennis racket', 'bottle', 'wine glass', 'cup', 'fork',
'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich',
'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut',
'cake', 'chair', 'couch', 'potted plant', 'bed', 'toilet',
'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
'toaster', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
'hair drier', 'toothbrush'),
data_root='data/coco'),
val=dict(
type='CocoDataset',
ann_file='annotations/coco_subset_eval.json',
img_prefix='images/eval',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
classes=('backpack', 'umbrella', 'handbag', 'tie', 'suitcase',
'frisbee', 'skis', 'snowboard', 'sports ball', 'kite',
'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
'tennis racket', 'bottle', 'wine glass', 'cup', 'fork',
'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich',
'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut',
'cake', 'chair', 'couch', 'potted plant', 'bed', 'toilet',
'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
'toaster', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
'hair drier', 'toothbrush'),
data_root='data/coco'),
test=dict(
type='CocoDataset',
ann_file='annotations/coco_subset_eval.json',
img_prefix='images/eval',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip', flip_ratio=0.5),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
],
classes=('backpack', 'umbrella', 'handbag', 'tie', 'suitcase',
'frisbee', 'skis', 'snowboard', 'sports ball', 'kite',
'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
'tennis racket', 'bottle', 'wine glass', 'cup', 'fork',
'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich',
'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut',
'cake', 'chair', 'couch', 'potted plant', 'bed', 'toilet',
'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
'toaster', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
'hair drier', 'toothbrush'),
data_root='data/coco'))
evaluation = dict(
metric=['bbox', 'segm'],
by_epoch=True,
jsonfile_prefix=
'/content/gdrive/MyDrive/Doutorado/COCO Subset/DetectoRS-master/checkpoints/subset'
)
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=500,
warmup_ratio=0.001,
step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
checkpoint_config = dict(interval=1)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
model = dict(
type='HybridTaskCascade',
backbone=dict(
type='DetectoRS_ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch',
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50'),
conv_cfg=dict(type='ConvAWS'),
sac=dict(type='SAC', use_deform=True),
stage_with_sac=(False, True, True, True),
output_img=True),
neck=dict(
type='RFP',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
num_outs=5,
rfp_steps=2,
aspp_out_channels=64,
aspp_dilations=(1, 3, 6, 1),
rfp_backbone=dict(
rfp_inplanes=256,
type='DetectoRS_ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
conv_cfg=dict(type='ConvAWS'),
sac=dict(type='SAC', use_deform=True),
stage_with_sac=(False, True, True, True),
pretrained='torchvision://resnet50',
style='pytorch')),
rpn_head=dict(
type='RPNHead',
in_channels=256,
feat_channels=256,
anchor_generator=dict(
type='AnchorGenerator',
scales=[8],
ratios=[0.5, 1.0, 2.0],
strides=[4, 8, 16, 32, 64]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[1.0, 1.0, 1.0, 1.0]),
loss_cls=dict(
type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
loss_bbox=dict(
type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
roi_head=dict(
type='HybridTaskCascadeRoIHead',
interleaved=True,
mask_info_flow=True,
num_stages=3,
stage_loss_weights=[1, 0.5, 0.25],
bbox_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
bbox_head=[
dict(type='Shared2FCBBoxHead', num_classes=50),
dict(type='Shared2FCBBoxHead', num_classes=50),
dict(type='Shared2FCBBoxHead', num_classes=50)
],
mask_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
out_channels=256,
featmap_strides=[4, 8, 16, 32]),
mask_head=[
dict(
type='HTCMaskHead',
with_conv_res=False,
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=50,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
dict(
type='HTCMaskHead',
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=50,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
dict(
type='HTCMaskHead',
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=50,
loss_mask=dict(
type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))
],
semantic_roi_extractor=dict(
type='SingleRoIExtractor',
roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
out_channels=256,
featmap_strides=[8]),
semantic_head=dict(
type='FusedSemanticHead',
num_ins=5,
fusion_level=1,
num_convs=4,
in_channels=256,
conv_out_channels=256,
num_classes=183,
loss_seg=dict(
type='CrossEntropyLoss', ignore_index=255, loss_weight=0.2))),
train_cfg=dict(
rpn=dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.3,
min_pos_iou=0.3,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=256,
pos_fraction=0.5,
neg_pos_ub=-1,
add_gt_as_proposals=False),
allowed_border=0,
pos_weight=-1,
debug=False),
rpn_proposal=dict(
nms_pre=2000,
max_per_img=2000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=[
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.5,
neg_iou_thr=0.5,
min_pos_iou=0.5,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False),
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.6,
neg_iou_thr=0.6,
min_pos_iou=0.6,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False),
dict(
assigner=dict(
type='MaxIoUAssigner',
pos_iou_thr=0.7,
neg_iou_thr=0.7,
min_pos_iou=0.7,
ignore_iof_thr=-1),
sampler=dict(
type='RandomSampler',
num=512,
pos_fraction=0.25,
neg_pos_ub=-1,
add_gt_as_proposals=True),
mask_size=28,
pos_weight=-1,
debug=False)
]),
test_cfg=dict(
rpn=dict(
nms_pre=1000,
max_per_img=1000,
nms=dict(type='nms', iou_threshold=0.7),
min_bbox_size=0),
rcnn=dict(
score_thr=0.001,
nms=dict(type='nms', iou_threshold=0.5),
max_per_img=100,
mask_thr_binary=0.5)))
classes = ('backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat',
'baseball glove', 'skateboard', 'surfboard', 'tennis racket',
'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl',
'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot',
'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
'potted plant', 'bed', 'toilet', 'laptop', 'mouse', 'remote',
'keyboard', 'cell phone', 'toaster', 'book', 'clock', 'vase',
'scissors', 'teddy bear', 'hair drier', 'toothbrush')
work_dir = '/content/gdrive/MyDrive/Doutorado/COCO Subset/DetectoRS-master/checkpoints/subset'
seed = 0
gpu_ids = range(0, 1)
Hello, I'm bringing another update. I have run many tests to try to find a solution to my problem, but none were successful. One odd result came when I trained the model from the pretrained checkpoint. I got the following log:
2021-09-13 17:51:02,460 - mmdet - INFO - Epoch [1][50/1001] lr: 1.978e-03, eta: 13:30:57, time: 4.068, data_time: 0.094, memory: 7942, loss_rpn_cls: 0.0244, loss_rpn_bbox: 0.0124, loss_semantic_seg: 0.1777, s0.loss_cls: 1.4451, s0.acc: 74.4570, s0.loss_bbox: 0.0751, s0.loss_mask: 0.6582, s1.loss_cls: 0.8104, s1.acc: 71.0209, s1.loss_bbox: 0.0830, s1.loss_mask: 0.3728, s2.loss_cls: 0.4637, s2.acc: 66.1233, s2.loss_bbox: 0.0558, s2.loss_mask: 0.1852, loss: 4.3637
2021-09-13 17:54:17,287 - mmdet - INFO - Epoch [1][100/1001] lr: 3.976e-03, eta: 13:10:34, time: 3.897, data_time: 0.049, memory: 7942, loss_rpn_cls: 0.0309, loss_rpn_bbox: 0.0161, loss_semantic_seg: 0.2387, s0.loss_cls: 0.5302, s0.acc: 89.2500, s0.loss_bbox: 0.0579, s0.loss_mask: 0.5119, s1.loss_cls: 0.3080, s1.acc: 87.5830, s1.loss_bbox: 0.0676, s1.loss_mask: 0.2740, s2.loss_cls: 0.1627, s2.acc: 87.4562, s2.loss_bbox: 0.0447, s2.loss_mask: 0.1403, loss: 2.3830
2021-09-13 17:57:20,034 - mmdet - INFO - Epoch [1][150/1001] lr: 5.974e-03, eta: 12:45:41, time: 3.655, data_time: 0.056, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 75.4767, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 75.0148, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 75.4754, s2.loss_bbox: nan, s2.loss_mask: 5.8979, loss: nan
2021-09-13 17:59:42,359 - mmdet - INFO - Epoch [1][200/1001] lr: 7.972e-03, eta: 11:51:56, time: 2.847, data_time: 0.054, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 2.6500, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 2.6500, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 2.6500, s2.loss_bbox: nan, s2.loss_mask: 0.7845, loss: nan
2021-09-13 18:02:05,137 - mmdet - INFO - Epoch [1][250/1001] lr: 9.970e-03, eta: 11:19:06, time: 2.856, data_time: 0.057, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 1.3833, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 1.3833, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 1.3833, s2.loss_bbox: nan, s2.loss_mask: 0.1729, loss: nan
2021-09-13 18:04:27,156 - mmdet - INFO - Epoch [1][300/1001] lr: 1.197e-02, eta: 10:55:55, time: 2.840, data_time: 0.052, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 3.9429, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 3.9429, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 3.9429, s2.loss_bbox: nan, s2.loss_mask: 0.1725, loss: nan
2021-09-13 18:06:50,738 - mmdet - INFO - Epoch [1][350/1001] lr: 1.397e-02, eta: 10:39:33, time: 2.872, data_time: 0.055, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 2.4000, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 2.4000, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 2.4000, s2.loss_bbox: nan, s2.loss_mask: 0.1725, loss: nan
2021-09-13 18:09:13,892 - mmdet - INFO - Epoch [1][400/1001] lr: 1.596e-02, eta: 10:26:28, time: 2.863, data_time: 0.055, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 1.0000, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 1.0000, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 1.0000, s2.loss_bbox: nan, s2.loss_mask: 0.1718, loss: nan
2021-09-13 18:11:38,207 - mmdet - INFO - Epoch [1][450/1001] lr: 1.796e-02, eta: 10:16:15, time: 2.886, data_time: 0.054, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 0.3750, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 0.3750, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 0.3750, s2.loss_bbox: nan, s2.loss_mask: 0.1719, loss: nan
2021-09-13 18:13:58,708 - mmdet - INFO - Epoch [1][500/1001] lr: 1.996e-02, eta: 10:06:09, time: 2.810, data_time: 0.051, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 0.4000, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 0.4000, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 0.4000, s2.loss_bbox: nan, s2.loss_mask: 0.1713, loss: nan
2021-09-13 18:16:19,442 - mmdet - INFO - Epoch [1][550/1001] lr: 2.000e-02, eta: 9:57:32, time: 2.815, data_time: 0.050, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 1.3000, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 1.3000, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 1.3000, s2.loss_bbox: nan, s2.loss_mask: 0.1716, loss: nan
2021-09-13 18:18:43,594 - mmdet - INFO - Epoch [1][600/1001] lr: 2.000e-02, eta: 9:51:03, time: 2.883, data_time: 0.057, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 1.2988, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 1.2988, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 1.2988, s2.loss_bbox: nan, s2.loss_mask: 0.1708, loss: nan
2021-09-13 18:21:07,244 - mmdet - INFO - Epoch [1][650/1001] lr: 2.000e-02, eta: 9:45:02, time: 2.873, data_time: 0.054, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 2.9333, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 2.9333, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 2.9333, s2.loss_bbox: nan, s2.loss_mask: 0.1702, loss: nan
2021-09-13 18:23:30,588 - mmdet - INFO - Epoch [1][700/1001] lr: 2.000e-02, eta: 9:39:28, time: 2.867, data_time: 0.054, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 2.0667, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 2.0667, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 2.0667, s2.loss_bbox: nan, s2.loss_mask: 0.1702, loss: nan
2021-09-13 18:25:50,656 - mmdet - INFO - Epoch [1][750/1001] lr: 2.000e-02, eta: 9:33:30, time: 2.801, data_time: 0.050, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 2.3000, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 2.3000, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 2.3000, s2.loss_bbox: nan, s2.loss_mask: 0.1701, loss: nan
2021-09-13 18:28:12,592 - mmdet - INFO - Epoch [1][800/1001] lr: 2.000e-02, eta: 9:28:25, time: 2.839, data_time: 0.052, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 3.9000, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 3.9000, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 3.9000, s2.loss_bbox: nan, s2.loss_mask: 0.1696, loss: nan
2021-09-13 18:30:35,744 - mmdet - INFO - Epoch [1][850/1001] lr: 2.000e-02, eta: 9:23:56, time: 2.863, data_time: 0.055, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 2.4018, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 2.4018, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 2.4018, s2.loss_bbox: nan, s2.loss_mask: 0.1692, loss: nan
2021-09-13 18:32:59,935 - mmdet - INFO - Epoch [1][900/1001] lr: 2.000e-02, eta: 9:19:53, time: 2.884, data_time: 0.055, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 3.3922, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 3.3922, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 3.3922, s2.loss_bbox: nan, s2.loss_mask: 0.1688, loss: nan
2021-09-13 18:35:22,486 - mmdet - INFO - Epoch [1][950/1001] lr: 2.000e-02, eta: 9:15:42, time: 2.851, data_time: 0.052, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 3.0476, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 3.0476, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 3.0476, s2.loss_bbox: nan, s2.loss_mask: 0.1686, loss: nan
2021-09-13 18:37:45,852 - mmdet - INFO - Epoch [1][1000/1001] lr: 2.000e-02, eta: 9:11:50, time: 2.867, data_time: 0.059, memory: 7942, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_semantic_seg: nan, s0.loss_cls: nan, s0.acc: 1.6538, s0.loss_bbox: nan, s0.loss_mask: nan, s1.loss_cls: nan, s1.acc: 1.6538, s1.loss_bbox: nan, s1.loss_mask: nan, s2.loss_cls: nan, s2.acc: 1.6538, s2.loss_bbox: nan, s2.loss_mask: 0.1673, loss: nan
2021-09-13 18:37:48,574 - mmdet - INFO - Saving checkpoint at 1 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 492/492, 1.3 task/s, elapsed: 376s, ETA: 0s
2021-09-13 18:44:08,514 - mmdet - INFO - Evaluating bbox...
2021-09-13 18:44:08,518 - mmdet - ERROR - The testing results of the whole dataset is empty.
2021-09-13 18:44:08,533 - mmdet - INFO - Epoch(val) [1][492]
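The lr column in the log above can be reproduced from the lr_config in the posted config (warmup='linear', warmup_iters=500, warmup_ratio=0.001, base lr 0.02). The sketch below assumes the usual linear warmup formula; the function name and the off-by-one between the logged iteration and the 0-based step are assumptions inferred from the logged values, not taken from the library source.

```python
def linear_warmup_lr(cur_iter, base_lr=0.02, warmup_iters=500, warmup_ratio=0.001):
    """Learning rate at 0-based step cur_iter under linear warmup (assumed formula)."""
    if cur_iter >= warmup_iters:
        return base_lr
    # k shrinks linearly from (1 - warmup_ratio) to 0 over the warmup period.
    k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
    return base_lr * (1 - k)


for logged_iter in (50, 100, 150, 500):
    # Assumption: the log line for iteration N shows the lr of 0-based step N - 1.
    print(logged_iter, "{:.3e}".format(linear_warmup_lr(logged_iter - 1)))
# 50 1.978e-03
# 100 3.976e-03
# 150 5.974e-03
# 500 1.996e-02
```

Read against the log, the first nan appears in the window between iterations 100 and 150, that is, while the learning rate was still below a third of the configured 0.02.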
Note that the first two log entries (iterations 50 and 100) report finite losses, after which the losses become nan again. The test also still returns "The testing results of the whole dataset is empty". I saw in another issue that this can happen when boxes or polygons fall outside the image, so I checked my annotations again using the following code:
from pycocotools.coco import COCO

# Path assembled from data_root and ann_file in the config above; adjust as needed.
coco = COCO('data/coco/annotations/coco_subset_train.json')
anns = coco.loadAnns(coco.getAnnIds())
print("Loaded COCO annotation file with {} annotations".format(len(anns)))
failures = []
for ann in anns:
    image = coco.loadImgs(ann['image_id'])[0]
    h = image['height']
    w = image['width']
    # Each polygon is a flat list [x0, y0, x1, y1, ...].
    for segm in ann['segmentation']:
        for i in range(0, len(segm), 2):
            if (segm[i] < 0 or segm[i] > w) or (segm[i+1] < 0 or segm[i+1] > h):
                print("Annotation out of image!")
                print("\tAnnotation {}: {}\n\tImage {}: {}".format(ann['id'], ann, image['id'], image))
                # Record each failing annotation only once.
                if ann['id'] not in failures:
                    failures.append(ann['id'])
print("{} annotations with failures.".format(len(failures)))
Which returns:
Loaded COCO annotation file with 3265 annotations
0 annotations with failures.
So I believe this rules out the boundaries problem. Does anyone have any insight?
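The check above only looks at polygon vertices. A complementary check on the bbox field could look like the sketch below. It assumes COCO-style [x, y, width, height] boxes; the helper name find_bad_bboxes and the min_size threshold are hypothetical, and the sample data is hand-made rather than taken from the real annotation file.

```python
def find_bad_bboxes(images, annotations, min_size=1.0):
    """Return ids of annotations whose [x, y, w, h] box is degenerate or leaves the image."""
    sizes = {img['id']: (img['width'], img['height']) for img in images}
    bad = []
    for ann in annotations:
        img_w, img_h = sizes[ann['image_id']]
        x, y, w, h = ann['bbox']
        degenerate = w < min_size or h < min_size
        outside = x < 0 or y < 0 or x + w > img_w or y + h > img_h
        if degenerate or outside:
            bad.append(ann['id'])
    return bad


# Tiny hand-made example (illustrative values only).
images = [{'id': 1, 'width': 640, 'height': 480}]
annotations = [
    {'id': 10, 'image_id': 1, 'bbox': [10, 20, 100, 50]},   # fine
    {'id': 11, 'image_id': 1, 'bbox': [600, 20, 100, 50]},  # exceeds the right edge
    {'id': 12, 'image_id': 1, 'bbox': [10, 20, 0, 50]},     # zero width
]
print(find_bad_bboxes(images, annotations))  # [11, 12]
```

With pycocotools, the two lists can be taken from coco.dataset['images'] and coco.dataset['annotations'].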
The testing results of the whole dataset is empty.
Your model is not correct, so it outputs nothing.
You should find out why your model can't be trained normally.
I found that loss_semantic_seg is increasing during training.
I suggest you train your model without semantic seg to ensure that the instance part of your model is normal.
You are correct. But I made no changes to the model, except for the number of output classes. Simply switching the config from DetectoRS to another model, Deformable DETR, trains correctly with no change to the data, which points to the problem being something with DetectoRS.
I suggest you train your model without semantic seg to ensure that the instance part of your model is normal.
Is there a Config option for this?
Deformable DETR does not use a semantic head, so it can be trained correctly. You can train your model with htc_without_semantic. If that trains normally, try training an HTC model with semantic segmentation. If that also trains normally, there is something wrong with DetectoRS. If not, the semantic annotations might be wrong, and you can check them.
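For the config posted earlier in this thread, removing the semantic branch by hand would touch roughly the keys shown below. This is only a sketch on plain nested dicts standing in for the real config object; strip_semantic is a hypothetical helper, and the example config is trimmed to the relevant keys.

```python
import copy


def strip_semantic(cfg):
    """Return a copy of a config dict with the semantic branch removed."""
    cfg = copy.deepcopy(cfg)
    roi_head = cfg['model']['roi_head']
    roi_head.pop('semantic_roi_extractor', None)
    roi_head.pop('semantic_head', None)

    train = cfg['data']['train']
    train.pop('seg_prefix', None)
    pipeline = []
    for step in train['pipeline']:
        if step['type'] == 'SegRescale':
            continue  # only needed for gt_semantic_seg
        if step['type'] == 'LoadAnnotations':
            step['with_seg'] = False
        if step['type'] == 'Collect':
            step['keys'] = [k for k in step['keys'] if k != 'gt_semantic_seg']
        pipeline.append(step)
    train['pipeline'] = pipeline
    return cfg


# Trimmed stand-in for the posted config (only the keys touched above).
cfg = dict(
    model=dict(roi_head=dict(
        type='HybridTaskCascadeRoIHead',
        semantic_roi_extractor=dict(type='SingleRoIExtractor'),
        semantic_head=dict(type='FusedSemanticHead', num_classes=183))),
    data=dict(train=dict(
        seg_prefix='stuffthingmaps/train',
        pipeline=[
            dict(type='LoadAnnotations', with_bbox=True, with_mask=True, with_seg=True),
            dict(type='SegRescale', scale_factor=0.125),
            dict(type='Collect',
                 keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks', 'gt_semantic_seg']),
        ])))

slim = strip_semantic(cfg)
print(sorted(slim['model']['roi_head']))                 # ['type']
print([s['type'] for s in slim['data']['train']['pipeline']])
```

If training is stable with the semantic branch removed, that narrows the problem to the semantic annotations or the semantic head, as suggested above.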
Should changing my COCO annotation file be accompanied by a change in the semantic annotations? If not, then I don't think that could be the cause, since I did not change the semantic segmentation annotations, as can be seen in the config.
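The semantic maps can be checked independently of the instance annotation file. The posted config uses num_classes=183 and ignore_index=255 for the semantic head, so any pixel label outside 0..182 other than 255 would not be a valid target for that loss. A minimal sketch, where unexpected_labels is a hypothetical helper and the pixel values are made up:

```python
from collections import Counter


def unexpected_labels(pixels, num_classes=183, ignore_index=255):
    """Count pixel labels outside [0, num_classes) that are not ignore_index."""
    counts = Counter(pixels)
    return {label: n for label, n in counts.items()
            if label != ignore_index and not 0 <= label < num_classes}


# Made-up pixel values: 183 and 200 fall outside the valid range.
print(unexpected_labels([0, 5, 182, 255, 183, 200, 200]))  # {183: 1, 200: 2}
```

For a real map, the flat pixel sequence could come from something like PIL's Image.open(path).getdata(), assuming the maps are single-channel PNGs.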
I'm closing the issue for now, however. I was able to train using other models. As soon as I'm available to test DetectoRS again I'll come back to this issue. Thank you again, @AronLin, for your time and help.
Checklist
Describe the bug
While using DetectoRS with a subset of COCO, I get an error in cv2.copyMakeBorder().
Reproduction
Through Google Colab, using the MMDet Tutorial as a base.
Changes to the CLASSES value from the CocoDataset config to reflect the modified annotations. While not relevant to the code, I copied the Pad class from Pipelines to the Colab Notebook for faster debugging. This is visible in the traceback.
A smaller subset of COCO using the same images with different class annotations.
Environment
Environment variables ($PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.): None
Error traceback
As I've seen in the previous cv2.copyMakeBorder errors posted here, this is likely due to a data/annotation issue. However, I can't locate it, since I'm using unmodified images from COCO and the COCO annotations with only the category_id changed. The padding is resulting in negative values, which break the assert, but I can't find where this originates.
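To see where a negative padding value could come from, it may help to write out the arithmetic. The sketch below assumes that padding is applied only at the bottom and right, and is computed as the target size minus the current size, with the target being the image size rounded up to a multiple of size_divisor=32. pad_amounts is a hypothetical helper, not the library code, and the shapes are made up.

```python
import math


def pad_amounts(array_hw, image_hw, size_divisor=32):
    """Bottom/right padding needed to bring an array up to the padded image size."""
    target_h = math.ceil(image_hw[0] / size_divisor) * size_divisor
    target_w = math.ceil(image_hw[1] / size_divisor) * size_divisor
    return target_h - array_hw[0], target_w - array_hw[1]


# The image itself: rounding up can never give a negative amount.
print(pad_amounts((800, 1200), (800, 1200)))  # (0, 16)
# Another array padded to the image's target shape while being taller than it.
print(pad_amounts((850, 1200), (800, 1200)))  # (-50, 16)
```

Under these assumptions, a negative value can only appear when something other than the image, such as a mask or segmentation map, is larger than the padded image shape, so comparing the sizes of those arrays with the image size is one place to look.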