I am trying to train on my own dataset with two classes (background and foreground), and I am confused by the output of the segmentation net:
```python
class MobileNetV3Seg(BaseModel):
    def __init__(self, nclass, aux=False, backbone='mobilenetv3_small', pretrained_base=False, **kwargs):
        super(MobileNetV3Seg, self).__init__(nclass, aux, backbone, pretrained_base, **kwargs)
        mode = backbone.split('_')[-1]
        self.head = _Head(nclass, mode, **kwargs)
        if aux:
            inter_channels = 40 if mode == 'large' else 24
            self.auxlayer = nn.Conv2d(inter_channels, nclass, 1)

    def forward(self, x):
        size = x.size()[2:]
        _, c2, _, c4 = self.base_forward(x)
        outputs = list()
        x = self.head(c4)
        x = F.interpolate(x, size, mode='bilinear', align_corners=True)
        outputs.append(x)  # Why is the output a list that only ever gets one element appended?
        if self.aux:  # What is the aux branch for?
            auxout = self.auxlayer(c2)
            auxout = F.interpolate(auxout, size, mode='bilinear', align_corners=True)
            outputs.append(auxout)
        return tuple(outputs)
```
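From reading other segmentation codebases, my guess is that the auxiliary output is for deep supervision: a second cross-entropy loss on an intermediate feature map (`c2`) to help gradients reach the backbone. A minimal sketch of what I assume the training loss looks like, assuming standard `F.cross_entropy` and an auxiliary weight of 0.4 (the weight is my assumption, not from this repo):

```python
import torch
import torch.nn.functional as F

def seg_loss(outputs, target, aux_weight=0.4):
    """Sketch of a main + auxiliary loss for the tuple this model returns.

    outputs: (main_logits,) or (main_logits, aux_logits),
             each of shape (N, nclass, H, W).
    target:  (N, H, W) tensor of integer class indices.
    aux_weight: assumed deep-supervision weight, not from this repo.
    """
    loss = F.cross_entropy(outputs[0], target)
    if len(outputs) > 1:  # auxiliary head is present during training
        loss = loss + aux_weight * F.cross_entropy(outputs[1], target)
    return loss

# Tiny smoke test with random logits for a 2-class problem.
main = torch.randn(2, 2, 8, 8)
aux = torch.randn(2, 2, 8, 8)
target = torch.randint(0, 2, (2, 8, 8))
print(seg_loss((main, aux), target).item())
```

Is that roughly how the two elements of the returned tuple are meant to be consumed?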
Furthermore, I wonder about the expected form of the segmentation ground truth. My dataset's ground truth has dimension (Height, Width, 1), where the last axis is 0 for background and 1 for foreground. But from my reading of your code, the Cityscapes ground truth seems to be split into NUMCLASS channels (19 for Cityscapes), i.e. something like (Height, Width, NUMCLASS). How should I adapt my dataset's ground truth to that form?
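For concreteness, here is a sketch of what I mean: my (H, W, 1) mask with {0, 1} values, squeezed to a class-index map, versus a per-class-channel (H, W, NUMCLASS) encoding. The NumPy conversion below is just my own illustration, not code from this repo:

```python
import numpy as np

# My ground truth: shape (H, W, 1), 0 = background, 1 = foreground.
mask = np.zeros((4, 4, 1), dtype=np.int64)
mask[1:3, 1:3, 0] = 1  # a 2x2 foreground square

# Class-index map: shape (H, W), values in [0, NUMCLASS).
index_map = mask[..., 0]

# One-hot / per-class-channel form: shape (H, W, NUMCLASS).
NUMCLASS = 2
one_hot = np.eye(NUMCLASS, dtype=np.int64)[index_map]

print(one_hot.shape)          # (4, 4, 2)
print(one_hot[..., 1].sum())  # 4 foreground pixels
```

Which of these two forms (index map or one-hot channels) does the training pipeline actually expect?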