Closed diplomatist closed 1 year ago
@diplomatist Thank you for your feedback. I'll check it.
When I use mmdet.DistributionFocalLoss or mmdet.GaussianFocalLoss, the error is:
# mmdet.DistributionFocalLoss or mmdet.GaussianFocalLoss bug
2/05 23:06:20 - mmengine - INFO - Result has been saved to /home/xux/CaiLiYuan/Project/mmyolo/work_dirs/yolov7_x_Dfl/modules_statistic_results.json
Traceback (most recent call last):
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
obj = obj_cls(**args) # type: ignore
TypeError: __init__() got an unexpected keyword argument 'use_sigmoid'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
obj = obj_cls(**args) # type: ignore
File "/home/xux/CaiLiYuan/Project/mmyolo/mmyolo/models/dense_heads/yolov7_head.py", line 192, in __init__
super().__init__(*args, **kwargs)
File "/home/xux/CaiLiYuan/Project/mmyolo/mmyolo/models/dense_heads/yolov5_head.py", line 185, in __init__
self.loss_cls: nn.Module = MODELS.build(loss_cls)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/registry/registry.py", line 454, in build
return self.build_func(cfg, *args, **kwargs, registry=self)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/registry/build_functions.py", line 240, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/registry/build_functions.py", line 136, in build_from_cfg
f'class `{obj_cls.__name__}` in ' # type: ignore
TypeError: class `DistributionFocalLoss` in mmdet/models/losses/gfocal_loss.py: __init__() got an unexpected keyword argument 'use_sigmoid'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
obj = obj_cls(**args) # type: ignore
File "/home/xux/CaiLiYuan/Project/mmyolo/mmyolo/models/detectors/yolo_detector.py", line 48, in __init__
init_cfg=init_cfg)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmdet/models/detectors/single_stage.py", line 35, in __init__
self.bbox_head = MODELS.build(bbox_head)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/registry/registry.py", line 454, in build
return self.build_func(cfg, *args, **kwargs, registry=self)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/registry/build_functions.py", line 240, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/registry/build_functions.py", line 136, in build_from_cfg
f'class `{obj_cls.__name__}` in ' # type: ignore
TypeError: class `YOLOv7Head` in mmyolo/models/dense_heads/yolov7_head.py: class `DistributionFocalLoss` in mmdet/models/losses/gfocal_loss.py: __init__() got an unexpected keyword argument 'use_sigmoid'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./tools/train.py", line 106, in <module>
main()
File "./tools/train.py", line 95, in main
runner = Runner.from_cfg(cfg)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/runner/runner.py", line 464, in from_cfg
cfg=cfg,
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/runner/runner.py", line 404, in __init__
self.model = self.build_model(model)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/runner/runner.py", line 806, in build_model
model = MODELS.build(model)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/registry/registry.py", line 454, in build
return self.build_func(cfg, *args, **kwargs, registry=self)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/registry/build_functions.py", line 240, in build_model_from_cfg
return build_from_cfg(cfg, registry, default_args)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/registry/build_functions.py", line 136, in build_from_cfg
f'class `{obj_cls.__name__}` in ' # type: ignore
TypeError: class `YOLODetector` in mmyolo/models/detectors/yolo_detector.py: class `YOLOv7Head` in mmyolo/models/dense_heads/yolov7_head.py: class `DistributionFocalLoss` in mmdet/models/losses/gfocal_loss.py: __init__() got an unexpected keyword argument 'use_sigmoid'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 24137) of binary: /home/xux/anaconda3/envs/zzza_py36/bin/python
Traceback (most recent call last):
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/distributed/launch.py", line 193, in <module>
main()
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/distributed/run.py", line 713, in run
)(*cmd_args)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
./tools/train.py FAILED
------------------------------------------------------------
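The traceback above bottoms out in `obj = obj_cls(**args)`: apart from `type`, the keys of the `loss_cls` config dict are handed to the loss constructor as keyword arguments, so a key left over from the previous loss (here `use_sigmoid`) raises `TypeError` when the new class does not accept it. A minimal sketch for spotting such keys before launching training. `StandInLoss` and `unexpected_keys` are hypothetical names, and the stand-in signature is an assumption for illustration, not copied from mmdet:

```python
import inspect


class StandInLoss:
    """Hypothetical stand-in for a loss class whose __init__ has no `use_sigmoid`."""

    def __init__(self, reduction='mean', loss_weight=1.0):
        self.reduction = reduction
        self.loss_weight = loss_weight


def unexpected_keys(cls, cfg):
    """Return config keys that cls.__init__ would reject as keyword arguments."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter accepts any key, so nothing can be unexpected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [k for k in cfg if k != 'type' and k not in params]


# Config dict in the style of the thread: `use_sigmoid` is a leftover key.
loss_cfg = dict(type='StandInLoss', use_sigmoid=True, reduction='mean', loss_weight=0.3)
bad = unexpected_keys(StandInLoss, loss_cfg)
print(bad)
```

With the real class imported in place of the stand-in, the same helper should list the keys to drop from the config, assuming the class does not take `**kwargs`.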
When I use mmdet.QualityFocalLoss, the error is:
# mmdet.QualityFocalLoss bug
Traceback (most recent call last):
File "./tools/train.py", line 106, in <module>
main()
File "./tools/train.py", line 102, in main
runner.train()
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/runner/runner.py", line 1684, in train
model = self.train_loop.run() # type: ignore
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/runner/loops.py", line 90, in run
self.run_epoch()
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/runner/loops.py", line 106, in run_epoch
self.run_iter(idx, data_batch)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/runner/loops.py", line 123, in run_iter
data_batch, optim_wrapper=self.runner.optim_wrapper)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/model/wrappers/distributed.py", line 121, in train_step
losses = self._run_forward(data, mode='loss')
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmengine/model/wrappers/distributed.py", line 161, in _run_forward
results = self(**data, mode=mode)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 886, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmdet/models/detectors/base.py", line 92, in forward
return self.loss(inputs, data_samples)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmdet/models/detectors/single_stage.py", line 78, in loss
losses = self.bbox_head.loss(x, batch_data_samples)
File "/home/xux/CaiLiYuan/Project/mmyolo/mmyolo/models/dense_heads/yolov5_head.py", line 450, in loss
losses = self.loss_by_feat(*loss_inputs)
File "/home/xux/CaiLiYuan/Project/mmyolo/mmyolo/models/dense_heads/yolov7_head.py", line 278, in loss_by_feat
device=device)
File "/home/xux/CaiLiYuan/Project/mmyolo/mmyolo/models/dense_heads/yolov7_head.py", line 353, in _calc_loss
target_obj) * self.obj_level_weights[i]
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmdet/models/losses/gfocal_loss.py", line 193, in forward
avg_factor=avg_factor)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmdet/models/losses/utils.py", line 99, in wrapper
loss = loss_func(pred, target, **kwargs)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/mmdet/models/losses/gfocal_loss.py", line 28, in quality_focal_loss
including category label and quality label, respectively"""
AssertionError: target for QFL must be a tuple of two elements,
including category label and quality label, respectively
terminate called after throwing an instance of 'c10::CUDAError'
what(): CUDA error: driver shutting down
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Exception raised from query at ../aten/src/ATen/cuda/CUDAEvent.h:95 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f1524215d62 in /home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::finishedGPUExecutionInternal() const + 0x11a (0x7f15814e79ba in /home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cpp.so)
frame #2: c10d::ProcessGroupNCCL::WorkNCCL::isCompleted() + 0x50 (0x7f15814e9cb0 in /home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cpp.so)
frame #3: c10d::ProcessGroupNCCL::workCleanupLoop() + 0x11c (0x7f15814ea77c in /home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/lib/libtorch_cuda_cpp.so)
frame #4: <unknown function> + 0xbd6df (0x7f15ec2b46df in /usr/lib/x86_64-linux-gnu/libstdc++.so.6)
frame #5: <unknown function> + 0x76db (0x7f15f390c6db in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #6: clone + 0x3f (0x7f15f363561f in /lib/x86_64-linux-gnu/libc.so.6)
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -6) local_rank: 0 (pid: 24553) of binary: /home/xux/anaconda3/envs/zzza_py36/bin/python
Traceback (most recent call last):
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/distributed/launch.py", line 193, in <module>
main()
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/distributed/run.py", line 713, in run
)(*cmd_args)
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/xux/anaconda3/envs/zzza_py36/lib/python3.6/site-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
./tools/train.py FAILED
------------------------------------------------------------
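The assertion in the QualityFocalLoss traceback says the loss wants its target as a pair (category label, quality label), while the traceback suggests the YOLOv7 head passes a single `target_obj` value. A simplified sketch of that mismatch in plain Python. `check_qfl_target` is a hypothetical helper that only mimics the structural check; it is not the mmdet implementation, and the sample values are made up:

```python
def check_qfl_target(target):
    """Simplified stand-in for the check that failed in the traceback.

    Assumption: the loss expects (category_labels, quality_scores);
    this helper validates only that two-element structure.
    """
    if not (isinstance(target, tuple) and len(target) == 2):
        raise AssertionError(
            'target for QFL must be a tuple of two elements, '
            'including category label and quality label, respectively')
    labels, scores = target
    if len(labels) != len(scores):
        raise AssertionError('labels and scores must have the same length')
    return labels, scores


# A single sequence of per-prediction targets, as a YOLO-style head would pass.
single_target = [0.0, 0.7, 0.0]
try:
    check_qfl_target(single_target)
    single_ok = True
except AssertionError:
    single_ok = False

# Category labels paired with quality scores pass the check.
paired_ok = check_qfl_target(([2, 0, 1], [0.9, 0.7, 0.4])) == ([2, 0, 1], [0.9, 0.7, 0.4])
print(single_ok, paired_ok)
```

Read this way, the failure looks like a mismatch in how the head calls the loss rather than a typo in the config, so swapping only `type` in the config is unlikely to be enough.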
@hhaAndroid
@diplomatist We're working on this, wait a minute.
Prerequisite
🐞 Describe the bug
When I change loss_cls and loss_obj of the YOLOv7 head to type='mmdet.FocalLoss', I get this bug: assert input.dim()==2 AssertionError
Environment
Additional information
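1. I only changed model->bbox_head->loss_cls and loss_obj to type='mmdet.FocalLoss' in configs/yolov7/yolov7_l_syncbn_fast_8x16b-300e_coco.py; training worked normally before this change. 2. I am using my own dataset, converted to COCO format, and it already trains and validates normally.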