implus / GFocalV2

Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection, CVPR2021
Apache License 2.0

About training with a negative dataset #11

Open yanghgai opened 3 years ago

yanghgai commented 3 years ago

My training set contains some negative samples (images with no detection targets). To include these images during training, I set filter_empty_gt=False, as follows:

    train=dict(
        type=dataset_type,
        ann_file=data_root + '/annotations/LumptrainCOCOmmdet.json',
        img_prefix=data_root+'/train',
        classes=classes,
        filter_empty_gt=False,
        pipeline=train_pipeline),

After a few training steps, the following error was reported:

    Consider using one of the following signatures instead:
        nonzero(*, bool as_tuple) (Triggered internally at /opt/conda/conda-bld/pytorch_1595629427478/work/torch/csrc/utils/python_arg_parser.cpp:766.)
      & (labels < bg_class_ind)).nonzero().squeeze(1)
    [W TensorIterator.cpp:924] Warning: Mixed memory format inputs detected while calling the operator. The operator will output channels_last tensor even if some of the inputs are not in channels_last format. (function operator())
    2021-01-26 16:05:28,372 - mmdet - INFO - Epoch [1][5/11040] lr: 8.992e-05, eta: 7 days, 18:36:26, time: 1.217, data_time: 0.843, memory: 1577, loss_cls: 0.0951, loss_bbox: 1.4986, loss_dfl: 0.7338, loss: 2.3275
    2021-01-26 16:05:29,729 - mmdet - INFO - Epoch [1][10/11040] lr: 1.898e-04, eta: 4 days, 18:07:22, time: 0.272, data_time: 0.002, memory: 1577, loss_cls: 0.1357, loss_bbox: 1.4617, loss_dfl: 0.7207, loss: 2.3181
    Traceback (most recent call last):
      File "./tools/train.py", line 179, in <module>
        main()
      File "./tools/train.py", line 175, in main
        meta=meta)
      File "/home2/yhg/GFocalV2-master/mmdet/apis/train.py", line 150, in train_detector
        runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
      File "/home/hanwei-1/anaconda3/envs/mmdetection/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
        epoch_runner(data_loaders[i], **kwargs)
      File "/home/hanwei-1/anaconda3/envs/mmdetection/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
        self.run_iter(data_batch, train_mode=True)
      File "/home/hanwei-1/anaconda3/envs/mmdetection/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
        **kwargs)
      File "/home/hanwei-1/anaconda3/envs/mmdetection/lib/python3.7/site-packages/mmcv/parallel/distributed.py", line 36, in train_step
        output = self.module.train_step(*inputs[0], **kwargs[0])
      File "/home2/yhg/GFocalV2-master/mmdet/models/detectors/base.py", line 234, in train_step
        losses = self(**data)
      File "/home/hanwei-1/anaconda3/envs/mmdetection/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/hanwei-1/anaconda3/envs/mmdetection/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
        return old_func(*args, **kwargs)
      File "/home2/yhg/GFocalV2-master/mmdet/models/detectors/base.py", line 168, in forward
        return self.forward_train(img, img_metas, **kwargs)
      File "/home2/yhg/GFocalV2-master/mmdet/models/detectors/single_stage.py", line 94, in forward_train
        gt_labels, gt_bboxes_ignore)
      File "/home2/yhg/GFocalV2-master/mmdet/models/dense_heads/base_dense_head.py", line 54, in forward_train
        losses = self.loss(*loss_inputs, gt_bboxes_ignore=gt_bboxes_ignore)
      File "/home/hanwei-1/anaconda3/envs/mmdetection/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 164, in new_func
        return old_func(*args, **kwargs)
      File "/home2/yhg/GFocalV2-master/mmdet/models/dense_heads/gfocal_head.py", line 396, in loss
        avg_factor = reduce_mean(avg_factor).item()
      File "/home2/yhg/GFocalV2-master/mmdet/core/utils/dist_utils.py", line 68, in reduce_mean
        dist.all_reduce(tensor.div_(dist.get_world_size()), op=dist.ReduceOp.SUM)
    RuntimeError: Integer division of tensors using div or / is no longer supported, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.

What is causing this?