Open CRCGlobal opened 3 years ago
I've discovered part of the problem but my attempt at fixing it still results in an error. The backbone layers we need to draw from in the Small model are different than those in the Large model.
I made this change to the .yaml file:
in_channels: [24, 40, 96, 576]
For reference, these are the values from Table 2 of the original publication, as they appear in mobilenetv3.py:
elif mode == 'small':
    # refer to Table 2 in paper
    mobile_setting = [
        # k, exp, c, se, nl, s,
        [3, 16, 16, True, 'RE', 2],
        [3, 72, 24, False, 'RE', 2],
        [3, 88, 24, False, 'RE', 1],  ### 3
        [5, 96, 40, True, 'HS', 2],
        [5, 240, 40, True, 'HS', 1],
        [5, 240, 40, True, 'HS', 1],  ### 6
        [5, 120, 48, True, 'HS', 1],
        [5, 144, 48, True, 'HS', 1],
        [5, 288, 96, True, 'HS', 2],  ### 9
        [5, 576, 96, True, 'HS', 1],
        [5, 576, 96, True, 'HS', 1],
    ]
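To check what each tap index in this table yields, it can help to tally the cumulative stride and output channels per `features` index. The sketch below rests on two assumptions that are not stated above: that `features[0]` is a stride-2 stem convolution with 16 output channels, and that the block list is followed by a stride-1 1x1 convolution to 576 channels, as in Table 2 of the paper. `layer_table` is a hypothetical helper name.

```python
# Small-model rows copied from the table above: k, exp, c, se, nl, s
SMALL_SETTING = [
    [3, 16, 16, True, 'RE', 2],
    [3, 72, 24, False, 'RE', 2],
    [3, 88, 24, False, 'RE', 1],
    [5, 96, 40, True, 'HS', 2],
    [5, 240, 40, True, 'HS', 1],
    [5, 240, 40, True, 'HS', 1],
    [5, 120, 48, True, 'HS', 1],
    [5, 144, 48, True, 'HS', 1],
    [5, 288, 96, True, 'HS', 2],
    [5, 576, 96, True, 'HS', 1],
    [5, 576, 96, True, 'HS', 1],
]


def layer_table(setting, stem_channels=16, stem_stride=2, last_channels=576):
    """Return [(features_index, out_channels, cumulative_stride), ...].

    Assumes features[0] is a stride-2 stem conv and the final entry is a
    stride-1 1x1 conv; both are assumptions about mobilenetv3.py.
    """
    rows = [(0, stem_channels, stem_stride)]
    stride = stem_stride
    for idx, (_k, _exp, c, _se, _nl, s) in enumerate(setting, start=1):
        stride *= s  # each stride-2 block halves the spatial size again
        rows.append((idx, c, stride))
    rows.append((len(setting) + 1, last_channels, stride))
    return rows


table = layer_table(SMALL_SETTING)
taps = [table[i] for i in (3, 6, 9, 12)]
for idx, channels, stride in taps:
    print(f"features[{idx}]: {channels} channels, stride {stride}")
```

Under these assumptions the channel counts come out as 24, 40, 96 and 576, matching the `in_channels` entry, while indices 9 and 12 share the same stride (32), so those two taps would have the same spatial size.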
I then saved the mode ('small' or 'large') as a MobileNetV3 instance attribute and used it in .forward() to select different backbone output layers for the Small and Large models.
def forward(self, x):
    '''x = self.features(x)
    x = x.mean(3).mean(2)
    x = self.classifier(x)
    return x'''
    if self.mode=='large':
        x2, x3, x4, x5 = None, None, None, None
        for stage in range(17): # https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/ppocr/modeling/backbones/det_mobilenet_v3.py
            x = self.features[stage](x)
            if stage == 3: # if s == 2 and start_idx > 3
                x2 = x
            elif stage == 6:
                x3 = x
            elif stage == 12:
                x4 = x
            elif stage == 16:
                x5 = x
        return x2, x3, x4, x5
    elif self.mode=='small':
        x2, x3, x4, x5 = None, None, None, None
        for stage in range(13): # https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.1/ppocr/modeling/backbones/det_mobilenet_v3.py
            x = self.features[stage](x)
            if stage == 3: # if s == 2 and start_idx > 3
                x2 = x
            elif stage == 6:
                x3 = x
            elif stage == 9:
                x4 = x
            elif stage == 12:
                x5 = x
        return x2, x3, x4, x5
    else:
        raise NotImplementedError
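The two branches differ only in the loop length and the four tap indices, so the same logic could be written once with a lookup table. This is a sketch, not the repository's code: `TAP_STAGES` and `collect_taps` are hypothetical names, and the helper accepts any indexable sequence of callables so it can be tried without building the model.

```python
# Stage indices copied from the forward() above; the names are hypothetical.
TAP_STAGES = {
    'large': (3, 6, 12, 16),
    'small': (3, 6, 9, 12),
}


def collect_taps(features, x, mode):
    """Run x through features and return the outputs at the tap stages."""
    if mode not in TAP_STAGES:
        raise NotImplementedError(f"unknown mode: {mode!r}")
    taps = TAP_STAGES[mode]
    outputs = []
    # Stop after the last tap, mirroring range(17) / range(13) above.
    for stage in range(taps[-1] + 1):
        x = features[stage](x)
        if stage in taps:
            outputs.append(x)
    return tuple(outputs)


# Stand-in "layers" that append their own index, to show which outputs are kept.
fake_features = [lambda trace, i=i: trace + [i] for i in range(13)]
x2, x3, x4, x5 = collect_taps(fake_features, [], 'small')
print(x2[-1], x3[-1], x4[-1], x5[-1])  # 3 6 9 12
```

Inside the module, `forward` could then reduce to `return collect_taps(self.features, x, self.mode)`. This only removes the duplication; it taps the same layers as before, so it does not change the resulting feature-map sizes.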
But now I get an error at an upsample-and-sum step indicating that one of the feature maps is 2x larger than expected (40 vs. 20 along dimension 3, which is the width in NCHW layout rather than the channel count).
$ CUDA_VISIBLE_DEVICES=0 python train.py experiments/seg_detector/ic15_mobilenet_v3_small_thre.yaml --num_gpus 1
[INFO] [2021-06-10 17:29:43,453] Training epoch 0
Traceback (most recent call last):
  File "train.py", line 70, in <module>
    main()
  File "train.py", line 67, in main
    trainer.train()
  File "/home/mroos/Code/gatekeeper_differentiable_binarization/trainer.py", line 86, in train
    epoch=epoch, step=self.steps)
  File "/home/mroos/Code/gatekeeper_differentiable_binarization/trainer.py", line 109, in train_step
    results = model.forward(batch, training=True)
  File "/home/mroos/Code/gatekeeper_differentiable_binarization/structure/model.py", line 56, in forward
    pred = self.model(data, training=self.training)
  File "/home/mroos/python_envs/env_torch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/mroos/python_envs/env_torch/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 159, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/mroos/python_envs/env_torch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/mroos/Code/gatekeeper_differentiable_binarization/structure/model.py", line 19, in forward
    return self.decoder(self.backbone(data), *args, **kwargs)
  File "/home/mroos/python_envs/env_torch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/mroos/Code/gatekeeper_differentiable_binarization/decoders/seg_detector.py", line 124, in forward
    out4 = self.up5(in5) + in4 # 1/16
RuntimeError: The size of tensor a (40) must match the size of tensor b (20) at non-singleton dimension 3
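The 40-versus-20 mismatch can be reproduced with plain arithmetic on feature-map widths. The sketch below makes several assumptions that the thread does not confirm: that each top-down step in `seg_detector.py` upsamples by a factor of 2 before the sum, that the input is 640 pixels wide, and that the four taps sit at strides 8, 16, 32 and 32 (which follows from the Small table if `features[0]` is a stride-2 stem). `check_top_down` is a hypothetical helper.

```python
def check_top_down(input_width, strides, scale=2):
    """Return the top-down sums whose operand widths disagree.

    strides are the downsampling factors of in2..in5. Each step upsamples
    the coarser map by `scale` and adds it to the next finer map.
    """
    names = ['in2', 'in3', 'in4', 'in5']
    widths = [input_width // s for s in strides]
    mismatches = []
    # Walk from the coarsest pair (in5 -> in4) down to the finest (in3 -> in2).
    for i in range(len(widths) - 1, 0, -1):
        upsampled = widths[i] * scale
        lateral = widths[i - 1]
        if upsampled != lateral:
            mismatches.append((names[i], names[i - 1], upsampled, lateral))
    return mismatches


# Assumed strides for the Small taps at stages 3, 6, 9, 12.
print(check_top_down(640, [8, 16, 32, 32]))  # [('in5', 'in4', 40, 20)]
```

With these assumed values, the only failing sum is the one between the upsampled `in5` and `in4`, the same pair and the same sizes as in the traceback. A stride sequence that doubles at every level, such as `[4, 8, 16, 32]`, returns an empty list.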
Large is downsampled by 4 after stage 3, whereas Small is already downsampled by 8 there. You need to fix either "mobile_setting" or the decoder's "forward" (seg_detector.py).
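One way to act on this suggestion on the backbone side would be to tap the last layer at each of the strides 4, 8, 16 and 32 rather than using fixed indices. The sketch below is an illustration under assumptions, not a tested fix: it assumes a stride-2 stem with 16 channels at `features[0]` and a final stride-1 1x1 convolution to 576 channels, and `pick_taps` is a hypothetical helper.

```python
# (out_channels, stride) columns of the Small mobile_setting rows.
SMALL_CS = [
    (16, 2), (24, 2), (24, 1),
    (40, 2), (40, 1), (40, 1),
    (48, 1), (48, 1),
    (96, 2), (96, 1), (96, 1),
]


def pick_taps(channel_stride, wanted=(4, 8, 16, 32),
              stem_channels=16, stem_stride=2, last_channels=576):
    """Return [(features_index, out_channels)] for the last layer at each stride."""
    # Index 0 is the assumed stem conv; the last entry is the assumed 1x1 conv.
    layers = [(stem_channels, stem_stride)]
    stride = stem_stride
    for c, s in channel_stride:
        stride *= s
        layers.append((c, stride))
    layers.append((last_channels, stride))

    taps = []
    for target in wanted:
        candidates = [i for i, (_c, st) in enumerate(layers) if st == target]
        if not candidates:
            raise ValueError(f"no layer with cumulative stride {target}")
        idx = candidates[-1]  # deepest layer that still has this stride
        taps.append((idx, layers[idx][0]))
    return taps


print(pick_taps(SMALL_CS))  # [(1, 16), (3, 24), (8, 48), (12, 576)]
```

Under these assumptions the Small taps would be `features[1]`, `features[3]`, `features[8]` and `features[12]`, and `in_channels` in the .yaml file would need to become `[16, 24, 48, 576]`. Whether this configuration trains well has not been verified here.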
I've successfully trained a MobileNetv3-Large backbone on ICDAR 2015. (See here for results.) However, I get the error below when trying to train a model with a MobileNetv3-Small backbone. @Microkitty, any suggestions?
Training command and resulting error:
This is my .yaml file: