Closed Atul997 closed 4 years ago
It is likely that the classification labels are not assigned correctly in your custom data. Please check the cls labels carefully.
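One way to check the labels is to scan the annotation file for category ids that were never declared. The snippet below is only a sketch: it assumes the custom data is in a COCO-style dict with `categories` and `annotations` keys, and `find_bad_labels`, `label_histogram` and the toy data are hypothetical names and values, not part of mmdet.

```python
from collections import Counter


def find_bad_labels(coco):
    """Return annotations whose category_id is not declared in coco['categories']."""
    valid_ids = {cat["id"] for cat in coco["categories"]}
    return [ann for ann in coco["annotations"] if ann["category_id"] not in valid_ids]


def label_histogram(coco):
    """Count how many annotations use each category_id."""
    return Counter(ann["category_id"] for ann in coco["annotations"])


# Toy data (made up): annotation 12 uses an undeclared category id.
toy = {
    "categories": [{"id": 1, "name": "cat"}, {"id": 2, "name": "dog"}],
    "annotations": [
        {"id": 10, "image_id": 0, "category_id": 1},
        {"id": 11, "image_id": 0, "category_id": 2},
        {"id": 12, "image_id": 1, "category_id": 5},
    ],
}

bad = find_bad_labels(toy)
print("annotations with undeclared labels:", [ann["id"] for ann in bad])
print("label counts:", dict(label_histogram(toy)))
```

For real data, load the annotation file with `json.load` and pass the resulting dict in place of `toy`. A class with zero or very few annotations in the histogram is also worth a second look.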
If that is the case, training should have stopped during the first epoch. Why did it run for 4 epochs?
If you are sure that your classification labels are exactly right, I suggest trying:
weight_targets += 0.1
weight_targets[weight_targets > 1.0] = 1.0
at line 180 of gfl_head.py.
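To see what those two lines do numerically, here is a small sketch that uses plain Python lists instead of torch tensors so it runs without PyTorch. The helper name `apply_workaround` and the sample values are made up for illustration. The point is that every weight ends up strictly positive and capped at 1.0.

```python
def apply_workaround(weights, offset=0.1, cap=1.0):
    """Mimic `weight_targets += 0.1` followed by clamping values above 1.0 to 1.0."""
    return [min(w + offset, cap) for w in weights]


# Made-up weights: two anchors with zero weight and one close to 1.
weights = [0.0, 0.0, 0.95]
adjusted = apply_workaround(weights)

print(adjusted)
print("any positive before:", any(w > 0 for w in weights[:2]))
print("all positive after: ", all(w > 0 for w in adjusted))
```

If the crash only happens when all weights in a batch are zero (an assumption, not something confirmed in this thread), this shift would avoid that case, which may also be why training could run for several epochs before failing.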
By the way, I wonder whether the validation performance for the first 3 epochs looks good. Can you provide the APs for the first 3 epochs?
I am training a model on custom data using the config file gfl_dcn_r101_ms2x.py. After 4 epochs I get the following error:

  File "tools/train.py", line 151, in <module>
    main()
  File "tools/train.py", line 147, in main
    meta=meta)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.1.0+32863f2-py3.6-linux-x86_64.egg/mmdet/apis/train.py", line 165, in train_detector
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/usr/local/lib/python3.6/dist-packages/mmcv/runner/runner.py", line 359, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/mmcv/runner/runner.py", line 263, in train
    self.model, data_batch, train_mode=True, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.1.0+32863f2-py3.6-linux-x86_64.egg/mmdet/apis/train.py", line 75, in batch_processor
    losses = model(**data)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/parallel/data_parallel.py", line 153, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.1.0+32863f2-py3.6-linux-x86_64.egg/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.1.0+32863f2-py3.6-linux-x86_64.egg/mmdet/models/detectors/base.py", line 147, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.1.0+32863f2-py3.6-linux-x86_64.egg/mmdet/models/detectors/single_stage.py", line 71, in forward_train
    *loss_inputs, gt_bboxes_ignore=gt_bboxes_ignore)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.1.0+32863f2-py3.6-linux-x86_64.egg/mmdet/core/fp16/decorators.py", line 127, in new_func
    return old_func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.1.0+32863f2-py3.6-linux-x86_64.egg/mmdet/models/anchor_heads/gfl_head.py", line 254, in loss
    cfg=cfg)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.1.0+32863f2-py3.6-linux-x86_64.egg/mmdet/core/utils/misc.py", line 24, in multi_apply
    return tuple(map(list, zip(*map_results)))
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.1.0+32863f2-py3.6-linux-x86_64.egg/mmdet/models/anchor_heads/gfl_head.py", line 186, in loss_single
    avg_factor=1.0)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/mmdet-1.1.0+32863f2-py3.6-linux-x86_64.egg/mmdet/models/losses/iou_loss.py", line 200, in forward
    return (pred * weight).sum() # 0
RuntimeError: The size of tensor a (4) must match the size of tensor b (529) at non-singleton dimension 1
How to resolve this?
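For anyone reading the error message: the failing expression is `(pred * weight).sum()`, and the reported sizes (4 and 529 at dimension 1) would fit a `pred` of shape (529, 4) multiplied by a `weight` of shape (529,). Those shapes are an assumption inferred from the message, not confirmed in this thread. The sketch below re-implements the usual broadcasting rule in plain Python (the helper `broadcast_shape` is hypothetical) to show why such a pair cannot be multiplied, while a weight of shape (529, 1) could.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of a and b, or raise ValueError if incompatible.

    Rule: align shapes from the last dimension; each pair of sizes must be
    equal, or one of them must be 1.
    """
    ndim = max(len(a), len(b))
    result = []
    for i in range(1, ndim + 1):
        da = a[-i] if i <= len(a) else 1  # pad the shorter shape with 1s
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(
                f"size {da} must match size {db} at dimension {ndim - i}"
            )
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


# Assumed shapes: 529 positive boxes with 4 coordinates, one weight per box.
try:
    broadcast_shape((529, 4), (529,))
except ValueError as err:
    print("incompatible:", err)

# Adding a trailing axis to the weight makes the shapes compatible.
print(broadcast_shape((529, 4), (529, 1)))
```

With real tensors, the equivalent of the second case would be indexing the weight as `weight[:, None]` before multiplying. Whether that is the right fix inside iou_loss.py depends on what that code path is meant to return, so treat it as a direction to investigate, not a patch.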