I see `x = normalize(x, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])` in mobilenetv3.py and
```python
f1, f2, f3, f4 = self.backbone(src_sm)
...
hid, *rec = self.decoder(src_sm, f1, f2, f3, f4, r1, r2, r3, r4)
```
in model.py.
This means the backbone normalizes its input internally, but the `src_sm` passed to the decoder has not been normalized. Is that intentional?
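To make the question concrete, here is a minimal pure-Python sketch of what I think is happening. Only the mean/std values come from the quoted line. `normalize`, `backbone`, and `decoder` below are hypothetical stand-ins, and I am assuming the real `normalize` returns a new tensor rather than modifying its argument in place.

```python
# Hypothetical stand-ins, not the repository's code.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize(pixel, mean, std):
    # Assumed out-of-place: builds a new list and leaves the input untouched.
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]

def backbone(x):
    # Rebinding the local name x does not affect the caller's object.
    x = normalize(x, MEAN, STD)
    return x  # stand-in for the features f1..f4

def decoder(src, features):
    # Stand-in: simply hands back what it received.
    return src, features

src_sm = [0.5, 0.5, 0.5]  # one RGB pixel with values in [0, 1]
features = backbone(src_sm)
dec_src, dec_feat = decoder(src_sm, features)

print(dec_src is src_sm)    # the decoder sees the original, unnormalized input
print(dec_feat != dec_src)  # the backbone output was computed from normalized values
```

If that matches the real code, the decoder mixes features computed from normalized pixels with raw pixels in [0, 1].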