open-mmlab / mmsegmentation

OpenMMLab Semantic Segmentation Toolbox and Benchmark.
https://mmsegmentation.readthedocs.io/en/main/
Apache License 2.0

The inference speed of ICNet gradually increases #2119

Howietzh opened this issue 1 year ago

Howietzh commented 1 year ago

When running the script below, I found that the inference speed of ICNet increased gradually.

```
CUDA_VISIBLE_DEVICES=0 python -u tools/benchmark.py \
    configs/coarse_position/icnet_r18-d8_512x612_20k_coarseposition.py \
    work_dirs/icnet_r50-d8_512x612_20k_coarseposition/latest.pth \
    --repeat-times 5
```

I have tried to increase the num_warmup to 1000 and total_iters to 1200, but the problem remains unsolved.
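For reference, the measurement loop in tools/benchmark.py is roughly the following (a simplified sketch; `model` and `data_loader` stand in for the objects the script builds from the config, and details may differ between versions). Two things follow from this structure: warmup iterations are excluded from the total, and the reported fps is a running average over all post-warmup iterations, so it climbs toward its steady-state value by construction:

```python
import time

import torch

num_warmup, total_iters = 1000, 1200
pure_inf_time = 0.0
for i, data in enumerate(data_loader):
    torch.cuda.synchronize()              # flush queued CUDA work first
    start = time.perf_counter()
    with torch.no_grad():
        model(return_loss=False, rescale=True, **data)
    torch.cuda.synchronize()              # wait for this forward to finish
    if i >= num_warmup:                   # warmup iterations are discarded
        pure_inf_time += time.perf_counter() - start
        fps = (i + 1 - num_warmup) / pure_inf_time   # running average
    if i + 1 == total_iters:
        print(f'Overall fps: {fps:.2f} img / s')
        break
```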

My config file is as below:

```python
norm_cfg = dict(type='SyncBN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    backbone=dict(
        type='ICNet',
        backbone_cfg=dict(
            type='ResNetV1c',
            in_channels=3,
            depth=18,
            num_stages=4,
            out_indices=(0, 1, 2, 3),
            dilations=(1, 1, 2, 4),
            strides=(1, 2, 1, 1),
            norm_cfg=dict(type='SyncBN', requires_grad=True),
            norm_eval=False,
            style='pytorch',
            contract_dilation=True),
        in_channels=3,
        layer_channels=(128, 512),
        light_branch_middle_channels=32,
        psp_out_channels=512,
        out_channels=(64, 256, 256),
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False),
    neck=dict(
        type='ICNeck',
        in_channels=(64, 256, 256),
        out_channels=128,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False),
    decode_head=dict(
        type='FCNHead',
        in_channels=128,
        channels=128,
        num_convs=1,
        in_index=2,
        dropout_ratio=0,
        num_classes=4,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        concat_input=False,
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=128,
            num_convs=1,
            num_classes=4,
            in_index=0,
            norm_cfg=dict(type='SyncBN', requires_grad=True),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=128,
            num_convs=1,
            num_classes=4,
            in_index=1,
            norm_cfg=dict(type='SyncBN', requires_grad=True),
            concat_input=False,
            align_corners=False,
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4))
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))
dataset_type = 'CustomDataset'
data_root = '../DataSets/CoarsePosition'
classes = ('back_ground', 'headface', 'fpc', 'connector')
palette = [[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0]]
img_norm_cfg = dict(
    mean=[73.98013, 72.46433, 71.06376],
    std=[33.015854, 28.528011, 26.457438],
    to_rgb=True)
crop_size = (612, 512)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(612, 512), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=(612, 512), cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(
        type='Normalize',
        mean=[73.98013, 72.46433, 71.06376],
        std=[33.015854, 28.528011, 26.457438],
        to_rgb=True),
    dict(type='Pad', size=(612, 512), pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(612, 512),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[73.98013, 72.46433, 71.06376],
                std=[33.015854, 28.528011, 26.457438],
                to_rgb=True),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='CustomDataset',
        data_root='../DataSets/CoarsePosition',
        classes=('back_ground', 'headface', 'fpc', 'connector'),
        palette=[[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0]],
        img_suffix='.png',
        img_dir='images/train',
        ann_dir='annotations/train',
        pipeline=train_pipeline),
    val=dict(
        type='CustomDataset',
        data_root='../DataSets/CoarsePosition',
        classes=('back_ground', 'headface', 'fpc', 'connector'),
        palette=[[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0]],
        img_suffix='.png',
        img_dir='images/val',
        ann_dir='annotations/val',
        pipeline=test_pipeline),
    test=dict(
        type='CustomDataset',
        data_root='../DataSets/CoarsePosition',
        classes=('back_ground', 'headface', 'fpc', 'connector'),
        palette=[[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0]],
        img_suffix='.png',
        img_dir='images/val',
        ann_dir='annotations/val',
        pipeline=test_pipeline))
log_config = dict(
    interval=50, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=0.0001, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=20000)
checkpoint_config = dict(by_epoch=False, interval=2000)
evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)
work_dir = 'work_dirs/icnet_r50-d8_512x612_20k_coarseposition/'
gpu_ids = range(0, 8)
auto_resume = False
```
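One setting above that can produce exactly this kind of warmup transient (a reading of the config, not a diagnosis confirmed in this thread): `cudnn_benchmark = True` makes cuDNN benchmark several convolution algorithms the first time it sees each input shape, so early iterations pay a one-off tuning cost, and with `keep_ratio=True` resizing, images of different aspect ratios keep presenting new shapes. A minimal way to test the hypothesis:

```python
# Hypothesis check (sketch): disable cuDNN autotuning and re-run benchmark.py.
# If per-iteration times become flat (possibly slightly slower overall), the
# rising fps was the autotuner amortizing its one-off search cost.
import torch

torch.backends.cudnn.benchmark = False
```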

MeowZheng commented 1 year ago

How many times have you observed this phenomenon? Did you check the CPU usage when you ran the experiments?

Howietzh commented 1 year ago

I have run this script many times, and the phenomenon occurs every time. I checked the CPU usage and it goes up to 273%. That can't be good, right? How should I handle this problem? Please help! By the way, I only run benchmark.py on the server, and its CPU usage still reaches 273%.
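For what it's worth, a ~273% reading usually just means PyTorch's intra-op thread pool is busy on about three cores, which is normal. One way to rule out CPU-side contention (a sketch, assuming the thread pool is the source of the jitter) is to cap the thread count before timing, or to pin the process with something like `taskset -c 0`:

```python
# Pin PyTorch to a single intra-op thread so CPU usage stays near 100% of one
# core; if the speed drift disappears, CPU contention was a factor.
import torch

torch.set_num_threads(1)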

MeowZheng commented 1 year ago

It might be a little unreasonable.

If you run this script many times and the speed increases every time, i.e. the inference time drops every time, will the time eventually drop to 0? That is clearly impossible, so the inference speed must stabilize at some point.

I think the reason might be your device. Please monitor the CPU/GPU usage of other tasks on your devices.
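One way to do that monitoring (a hypothetical helper, assuming `nvidia-smi` is on the PATH and psutil is installed):

```python
import subprocess
import time

import psutil

# Poll GPU utilization/memory (via nvidia-smi) and total CPU load once per
# second while the benchmark runs in another shell, to spot competing tasks.
for _ in range(30):
    gpu = subprocess.check_output(
        ['nvidia-smi', '--query-gpu=utilization.gpu,memory.used',
         '--format=csv,noheader']).decode().strip()
    cpu = psutil.cpu_percent(interval=None)
    print(f'GPU [{gpu}] | CPU {cpu:.0f}%')
    time.sleep(1)
```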

epris commented 1 month ago

Did you find a solution to your problem? Could you share it?