Open wwjwy opened 2 years ago
Might this give us a clearer comparison? For example, what are the 'other models'?
14.5fps
15.36fps
DeepLabv3plus has fewer parameters and less computation, yet its inference speed is not much faster?
@wwjwy can you share your config file that swaps the backbone to ConvNext?
model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='mmcls.ConvNeXt',
        arch='small',
        out_indices=[0, 1, 2, 3],
        drop_path_rate=0.3,
        layer_scale_init_value=1.0,
        gap_before_final_norm=False,
        init_cfg=dict(
            type='Pretrained',
            checkpoint='https://download.openmmlab.com/mmclassification/v0/convnext/downstream/convnext-small_3rdparty_32xb128-noema_in1k_20220301-303e75e3.pth',
            prefix='backbone.')),
    decode_head=dict(
        type='DepthwiseSeparableASPPHead',
        in_channels=768,
        in_index=3,
        channels=512,
        dilations=(1, 12, 24, 36),
        c1_in_channels=96,
        c1_channels=48,
        dropout_ratio=0.1,
        num_classes=5,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=dict(
        type='FCNHead',
        in_channels=384,
        in_index=2,
        channels=256,
        num_convs=1,
        concat_input=False,
        dropout_ratio=0.1,
        num_classes=5,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        align_corners=False,
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))
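As a side note, resolving the `mmcls.ConvNeXt` type usually also requires the mmcls registry to be imported in the config; a minimal sketch, assuming MMSegmentation's `custom_imports` mechanism and that mmcls is installed:

```python
# Sketch (assumption): enable cross-library registry lookup so the
# 'mmcls.' prefix in backbone type='mmcls.ConvNeXt' can be resolved.
custom_imports = dict(imports='mmcls.models', allow_failed_imports=False)
```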
@wwjwy I think there is some fundamental problem with the way ConvNeXt is providing the feature map output.
from mmcls.models import ConvNeXt
import torch
self = ConvNeXt(arch="small", out_indices=(0, 1, 2, 3))
self.eval()
inputs = torch.rand(1, 3, 1024, 1024)
level_outputs = self.forward(inputs)
for level_out in level_outputs:
    print(tuple(level_out.shape))
>>>> (1, 96)
>>>> (1, 192)
>>>> (1, 384)
>>>> (1, 768)
As you can see, the feature map is gone (the height and width dimensions). Not too sure why this is the case with mmcls for ConvNeXt.
Edit: I saw that you already set gap_before_final_norm=False, which fixes this issue, but as commented by the authors:
# The output of LayerNorm2d may be discontiguous, which
# may cause some problem in the downstream tasks
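For illustration, the collapse described above is what global average pooling does to a feature map; a minimal NumPy sketch (not mmcls code; the shape is chosen to match stage 0 of arch='small' at 1024x1024 input):

```python
import numpy as np

# Illustration only: a stage-0 feature map of shape (N, C, H, W).
feat = np.random.rand(1, 96, 256, 256)

# Global average pooling over the spatial axes removes H and W,
# which is why the printed shapes above are (1, 96), (1, 192), ...
gap = feat.mean(axis=(2, 3))

print(feat.shape)  # (1, 96, 256, 256)
print(gap.shape)   # (1, 96)
```

With `gap_before_final_norm=False`, this pooling is skipped for dense-prediction use, so the spatial dimensions survive.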
Hi, I have a question and hope to get some help. Why is the inference speed of DeepLabv3plus much lower than that of other models with similar parameter counts and computational cost (GFLOPs)?
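When comparing FPS numbers like the ones above, measurement methodology matters as much as FLOPs; a minimal sketch of a fair timing loop (`measure_fps` and the dummy model are hypothetical, for illustration; on GPU you would also need to synchronize before reading the clock):

```python
import time

def measure_fps(model, inputs, warmup=10, runs=50):
    """Rough FPS estimate: warm up first, then time repeated forward passes.

    `model` is any callable taking `inputs` (a placeholder here, not a
    real network). FLOPs alone do not determine speed: memory access
    patterns and per-layer kernel launch overhead also matter.
    """
    for _ in range(warmup):      # warmup runs are excluded from timing
        model(inputs)
    start = time.perf_counter()
    for _ in range(runs):
        model(inputs)
    elapsed = time.perf_counter() - start
    return runs / elapsed

# Usage with a dummy "model" standing in for an actual network:
fps = measure_fps(lambda x: x * 2, 3.0)
print(fps > 0)  # True
```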