I replaced the single-branch structure in a network for a low-level vision task with the multi-branch structure from the paper, but training no longer converges. I did not add the SE module; could that be the reason?
So far we have only replaced the 3x3 convolution in each residual block with the multi-branch structure, and with that change training does not converge. Have you tried applying the reparameterization technique to ResNet?
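To make the question concrete, here is a minimal sketch of the kind of multi-branch replacement and reparameterization I mean. It is not our actual code. It assumes a single channel, stride 1, zero padding, and that each branch (3x3, 1x1, identity) is followed by its own batch normalization in inference mode. All function names (`conv2d`, `bn_affine`, `multi_branch`, `fuse`) are hypothetical, and it uses plain Python so it runs without any framework.

```python
import math
import random

def conv2d(x, k, bias=0.0):
    """Single-channel cross-correlation, stride 1, zero padding that keeps the size."""
    h, w = len(x), len(x[0])
    r = len(k) // 2
    out = [[bias] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        out[i][j] += k[di + r][dj + r] * x[ii][jj]
    return out

def bn_affine(gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch norm written as y = scale * x + shift."""
    scale = gamma / math.sqrt(var + eps)
    return scale, beta - mean * scale

def multi_branch(x, k3, k1, bn3, bn1, bn_id):
    """Training-time form: BN(conv3x3(x)) + BN(conv1x1(x)) + BN(x)."""
    s3, b3 = bn_affine(*bn3)
    s1, b1 = bn_affine(*bn1)
    sid, bid = bn_affine(*bn_id)
    y3 = conv2d(x, k3)
    y1 = conv2d(x, [[k1]])  # a 1x1 kernel is a single scalar for one channel
    return [
        [s3 * y3[i][j] + b3 + s1 * y1[i][j] + b1 + sid * x[i][j] + bid
         for j in range(len(x[0]))]
        for i in range(len(x))
    ]

def fuse(k3, k1, bn3, bn1, bn_id):
    """Fold the three branches into one 3x3 kernel plus one bias."""
    s3, b3 = bn_affine(*bn3)
    s1, b1 = bn_affine(*bn1)
    sid, bid = bn_affine(*bn_id)
    fused = [[s3 * v for v in row] for row in k3]
    # The 1x1 and identity branches only contribute to the kernel center.
    fused[1][1] += s1 * k1 + sid
    return fused, b3 + b1 + bid

random.seed(0)
x = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(5)]
k3 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
k1 = random.uniform(-1, 1)
# Each BN is (gamma, beta, running_mean, running_var); the values are made up.
bns = [(random.uniform(0.5, 1.5), random.uniform(-1, 1),
        random.uniform(-1, 1), random.uniform(0.5, 1.5)) for _ in range(3)]

y_branches = multi_branch(x, k3, k1, *bns)
fused_kernel, fused_bias = fuse(k3, k1, *bns)
y_fused = conv2d(x, fused_kernel, fused_bias)

max_diff = max(abs(a - b) for ra, rb in zip(y_branches, y_fused)
               for a, b in zip(ra, rb))
print("max abs difference:", max_diff)
```

The sketch only checks that the fused 3x3 convolution matches the three-branch output at inference time. It says nothing about training behavior, which is the part that fails for us.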