Closed: akshaykulkarni07 closed this issue 3 years ago
Hi, this modified ASPP is borrowed from an IJCAI 2020 paper and an IJCV paper, and it is not the same as the modified ASPP in DeepLabv3 or DeepLabv3+. Also, according to the ablation study in the DeepLabv3 paper, the most effective modifications to ASPP are image-level feature pooling together with the multi-grid method, and neither of these is included in our code. So we think the modified ASPP in our code does not change the result.
Why do we not use the original ASPP from DeepLabv2? Because we need to compute the prototypes, which requires a feature vector (dim: 1xC) to represent each pixel in the image; that is to say, we need an FC layer or a 1x1 conv layer as the classifier. So we borrowed this block from the IJCAI 2020 and IJCV papers.
The ablation study in our paper also supports this: conventional self-training trained with our modified ASPP performs similarly to the original DeepLabv2 ASPP (45.9 mIoU reported in CRST).
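To make the prototype requirement concrete, here is a minimal, hypothetical sketch (not the repository's actual code; names are illustrative) of how per-pixel 1xC features from such a head can be averaged into class prototypes under predicted labels:

```python
import torch

# Hypothetical sketch: compute class prototypes from per-pixel features.
# `feat` is the 256-channel head output (N, C, H, W); `pred` holds a
# (pseudo-)label per pixel, e.g. the argmax of the 1x1-conv classifier.
def class_prototypes(feat, pred, num_classes):
    n, c, h, w = feat.shape
    feat = feat.permute(0, 2, 3, 1).reshape(-1, c)  # one 1xC vector per pixel
    pred = pred.reshape(-1)
    protos = torch.zeros(num_classes, c)
    for k in range(num_classes):
        mask = pred == k
        if mask.any():
            protos[k] = feat[mask].mean(dim=0)  # average features of class k
    return protos  # (num_classes, C): one prototype vector per class

feat = torch.randn(2, 256, 8, 8)
pred = torch.randint(0, 19, (2, 8, 8))
print(class_prototypes(feat, pred, 19).shape)  # torch.Size([19, 256])
```

This is exactly why the head needs a shared 1xC feature before the classifier: the original DeepLabv2 ASPP only produces class logits, with no such per-pixel feature to average.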
Why not modify `out_channels=256` of the `Conv2d` in ASPP and add an extra `Conv2d` with `in_channels=256, out_channels=num_classes` to achieve this? I think this is the easiest way to get a feature vector for each pixel in an image. Have you tried this and found it ineffective?
Please forgive me if this is a stupid question. 😀
```python
class MultiOutASPP(nn.Module):
    def __init__(self, inplanes, dilation_series=[6, 12, 18, 24],
                 padding_series=[6, 12, 18, 24], outplanes=19):
        super(MultiOutASPP, self).__init__()
        self.conv2d_list = nn.ModuleList()
        # one dilated 3x3 branch per rate, each mapping inplanes -> 256 channels
        for dilation, padding in zip(dilation_series, padding_series):
            self.conv2d_list.append(
                nn.Conv2d(inplanes, 256, kernel_size=3, stride=1,
                          padding=padding, dilation=dilation, bias=True))
        # 1x1 classifier head: 256-d per-pixel feature -> class logits
        self.classifier = nn.Conv2d(256, outplanes, kernel_size=1, padding=0,
                                    dilation=1, bias=True)
        for m in self.conv2d_list:
            m.weight.data.normal_(0, 0.01)
        self.classifier.weight.data.normal_(0, 0.01)

    def forward(self, x):
        # sum the dilated branches into one shared 256-d feature map
        feat = self.conv2d_list[0](x)
        for i in range(len(self.conv2d_list) - 1):
            feat += self.conv2d_list[i + 1](x)
        out = self.classifier(feat)
        return {'feat': feat, 'out': out}
```
If we modify `out_channels=256` of the `Conv2d` in ASPP, the capacity of the ASPP will be smaller than that of the standard ASPP (`out_channels=1024`). For a fair comparison with Seg_Uncertainty, we borrow this block from it.
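The capacity gap can be checked directly. A rough sketch (illustrative numbers only, assuming 2048-channel ResNet-101 features; not code from the repository) comparing the parameter count of a single dilated 3x3 branch at the two widths:

```python
import torch.nn as nn

# Illustrative capacity check: parameters of one dilated 3x3 ASPP branch
# at out_channels=256 versus out_channels=1024, assuming a 2048-channel
# backbone feature map (hypothetical numbers, not taken from the repo).
def branch_params(out_ch, in_ch=2048):
    conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=6, dilation=6)
    return sum(p.numel() for p in conv.parameters())

small = branch_params(256)    # 2048*256*9 + 256  ~ 4.7M parameters
large = branch_params(1024)   # 2048*1024*9 + 1024 ~ 18.9M parameters
print(large / small)  # prints 4.0
```

So narrowing the branches to 256 channels cuts each branch to a quarter of the parameters, which is the capacity difference the reply above refers to.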
I think it would be better for you to report results with DeepLabv2 (this should not be difficult). DeepLabv2 is the mainstream choice for segmentation, as in the SDCA and FADA papers.
Anything new? Has anyone reproduced ProDA with minimal changes to the ASPP?
Hi, congratulations on your great work and acceptance in CVPR '21. Thanks for releasing the code and model weights.
In the paper, you mention using DeepLabv2 with a ResNet-101 backbone. However, your code actually makes use of a modified ASPP module (`ClassifierModule2` in `models/deeplabv2.py`), while `ClassifierModule` is the one that corresponds to DeepLabv2. Similar issues were raised here and here, which mention that this type of ASPP module is used in DeepLabv3+, which has much better performance than DeepLabv2 (both issues were raised in Jan. 2020). Could you please confirm this point? And if you have also performed experiments with the original DeepLabv2 model, could you report those results for a fair comparison with prior arts?
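For context, here is a hedged sketch of the original DeepLabv2 ASPP head as it appears in common PyTorch ports (the class name here is illustrative, not the repository's `ClassifierModule`): each dilated branch predicts class logits directly and the branch outputs are summed, so there is no shared per-pixel feature vector available for prototype computation.

```python
import torch
import torch.nn as nn

# Sketch of the original DeepLabv2 ASPP head (illustrative, based on
# common PyTorch ports): every dilated 3x3 branch maps backbone features
# straight to num_classes logits, and the branches are summed.
class ClassifierModuleV2(nn.Module):
    def __init__(self, inplanes=2048, dilations=(6, 12, 18, 24), num_classes=19):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(inplanes, num_classes, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        )

    def forward(self, x):
        # no intermediate 256-d feature: output is logits only
        return sum(b(x) for b in self.branches)  # (N, num_classes, H, W)

logits = ClassifierModuleV2()(torch.randn(1, 2048, 4, 4))
print(logits.shape)  # torch.Size([1, 19, 4, 4])
```

Comparing this with `MultiOutASPP` above makes the difference explicit: the modified head inserts a shared 256-channel feature map before a 1x1 classifier, which the prototype computation needs.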