Open jigongbao opened 1 year ago
I can confirm the issue still exists on MMDetection 3.0 with PyTorch 2.0.
Minimal command to reproduce the error with the default config:
python tools/train.py configs/tood/tood_r50_fpn_1x_coco.py --auto-scale-lr --amp
The error I got:
RuntimeError: torch.nn.functional.binary_cross_entropy and torch.nn.BCELoss are unsafe to autocast. Many models use a sigmoid layer right before the binary cross entropy layer. In this case, combine the two layers using torch.nn.functional.binary_cross_entropy_with_logits or torch.nn.BCEWithLogitsLoss. binary_cross_entropy_with_logits and BCEWithLogits are safe to autocast.
Is AMP/FP16 still unsupported on TOOD, as mentioned in #7113?
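For context on why PyTorch refuses to autocast this op, here is a minimal stdlib-only sketch (no PyTorch required) of what happens when a sigmoid output is rounded to half precision before the binary cross entropy is taken. The helper names `to_fp16`, `bce_from_prob` and `bce_from_logit` are made up for illustration and are not MMDetection or PyTorch APIs. The clamp of the log term at -100 mirrors the behaviour documented for `torch.nn.BCELoss`; the rest is plain arithmetic.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack('e', struct.pack('e', x))[0]


def bce_from_prob(p: float, target: float) -> float:
    """BCE on a probability; log terms are clamped at -100 (as BCELoss documents)."""
    def clamped_log(v: float) -> float:
        return max(math.log(v), -100.0) if v > 0.0 else -100.0
    return -(target * clamped_log(p) + (1.0 - target) * clamped_log(1.0 - p))


def bce_from_logit(x: float, target: float) -> float:
    """Numerically stable BCE computed directly from the logit."""
    return max(x, 0.0) - x * target + math.log1p(math.exp(-abs(x)))


logit, target = 10.0, 0.0                      # confident prediction, negative label
prob_fp32 = 1.0 / (1.0 + math.exp(-logit))     # about 0.9999546
prob_fp16 = to_fp16(prob_fp32)                 # rounds up to exactly 1.0

loss_prob_fp16 = bce_from_prob(prob_fp16, target)   # log(1 - 1.0) hits the clamp
loss_logit = bce_from_logit(logit, target)          # stays close to the logit value

print(prob_fp16, loss_prob_fp16, loss_logit)
```

In half precision the probability saturates at 1.0, so the probability-based loss jumps to the clamp value instead of staying near 10, while the logit-based form is unaffected. This is the failure mode the error message points at when it recommends `binary_cross_entropy_with_logits`.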
Describe the bug
When I train TOOD with AMP I get an error; when I remove AMP it works fine.
Reproduction
What command or script did you run?
CUDA_VISIBLE_DEVICES=2,3,4,5 bash ./tools/dist_train.sh configs/tood/tood_r50_fpn_1x_vis.py 4
Did you make any modifications on the code or config? Did you understand what you have modified?
Yes, I modified the FPN to output only P2 to P5 and replaced the SGD optimizer with AdamW. Here is my config.
_base_ = [
    '../_base_/datasets/visdrone_detection.py',
    '../_base_/schedules/schedule_vis.py',
    '../_base_/default_runtime.py'
]
# model settings
model = dict(
    type='TOOD',
    data_preprocessor=dict(
        type='DetDataPreprocessor',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        bgr_to_rgb=True,
        pad_size_divisor=32),
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch',
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=0,
        add_extra_convs='on_output',
        num_outs=4),
    bbox_head=dict(
        type='TOODHead',
        num_classes=10,
        in_channels=256,
        stacked_convs=6,
        feat_channels=256,
        anchor_type='anchor_free',
        anchor_generator=dict(
            type='AnchorGenerator',
            ratios=[1.0],
            octave_base_scale=8,
            scales_per_octave=1,
            strides=[4, 8, 16, 32]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[.0, .0, .0, .0],
            target_stds=[0.1, 0.1, 0.2, 0.2]),
        initial_loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            activated=True,  # use probability instead of logit as input
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_cls=dict(
            type='QualityFocalLoss',
            use_sigmoid=True,
            activated=True,  # use probability instead of logit as input
            beta=2.0,
            loss_weight=1.0),
        loss_bbox=dict(type='GIoULoss', loss_weight=2.0)),
    train_cfg=dict(
        initial_epoch=4,
        initial_assigner=dict(type='ATSSAssigner', topk=9),
        assigner=dict(type='TaskAlignedAssigner', topk=13),
        alpha=1,
        beta=6,
        allowed_border=-1,
        pos_weight=-1,
        debug=False),
    test_cfg=dict(
        nms_pre=1500,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(type='nms', iou_threshold=0.6),
        max_per_img=500))
# optimizer
optim_wrapper = dict(
    type='OptimWrapper',
    paramwise_cfg=dict(
        custom_keys={
            'absolute_pos_embed': dict(decay_mult=0.),
            'relative_position_bias_table': dict(decay_mult=0.),
            'norm': dict(decay_mult=0.)
        }),
    optimizer=dict(
        _delete_=True,
        type='AdamW',
        lr=0.005,
        betas=(0.9, 0.999),
        weight_decay=0.05))
What dataset did you use? VisDrone.
Environment
Please run python mmdet/utils/collect_env.py to collect necessary environment information and paste it here.
sys.platform: linux
Python: 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0,1,2,3,4,5,6,7,8,9: NVIDIA GeForce RTX 2080 Ti
CUDA_HOME: None
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.12.1
PyTorch compiling details: PyTorch built with:
TorchVision: 0.13.1
OpenCV: 4.7.0
MMEngine: 0.7.2
MMDetection: 3.0.0+unknown