hujie-frank / SENet

Squeeze-and-Excitation Networks

The scale values all regress to 0.65 #78

Closed · ArtyZe closed this issue 5 years ago

ArtyZe commented 5 years ago

Hello @hujie-frank, thanks for your great work. But when I use the SE module after a normal conv layer, all the scale values regress to 0.65; they are not as clearly differentiated as in your paper. Do you have any ideas? Thanks
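
For concreteness, a minimal sketch of this kind of setup (PyTorch for illustration only; this repo's implementation is Caffe-based, and the channel sizes below are made up):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: global pool -> FC -> ReLU -> FC -> sigmoid -> rescale."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)        # squeeze: (N, C, H, W) -> (N, C, 1, 1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                          # per-channel scales in (0, 1)
        )

    def forward(self, x):
        n, c, _, _ = x.shape
        s = self.fc(self.pool(x).view(n, c)).view(n, c, 1, 1)
        return x * s                               # excitation: rescale each channel

# SE attached directly after an ordinary conv layer, as described above:
layer = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, padding=1),
    nn.BatchNorm2d(128),
    nn.ReLU(inplace=True),
    SEBlock(128),
)
out = layer(torch.randn(2, 64, 32, 32))            # scales applied channel-wise
```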

lishen-shirley commented 5 years ago

Could you give a more detailed description of your problem, e.g., network configuration, task, input?

ArtyZe commented 5 years ago

Thanks. Now I have solved the problem, but I still have some questions:

  1. Do the two FC layers need biases or not?
  2. Must I insert the SE module after two conv layers, as in the paper?
  3. Have you tried using the SE module after a normal conv layer, rather than in a residual module?
  4. Finally, most of my sigmoid outputs are exactly 0 or 1, not spread across the (0, 1) range as in the paper. Is that OK?

Best regards

lishen-shirley commented 5 years ago

Besides residual architectures, we also tested it on non-residual backbones in the paper. We presented one solution, but the blocks can be implemented in a flexible manner (e.g., see the extension at https://arxiv.org/pdf/1709.01507.pdf). Besides trying the FC layers with/without biases, you can also add BatchNorm to the FC layers, which we found to work well on several backbone architectures.
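
To make that suggestion concrete, here is a hedged sketch of such a variant (PyTorch for illustration; the `use_bias`/`use_bn` flags are invented names, and placing `BatchNorm1d` after the first FC is just one plausible reading of the comment):

```python
import torch.nn as nn

class SEBlockBN(nn.Module):
    """SE variant with toggleable FC biases and optional BatchNorm after the
    first FC, per the suggestion above (the exact BN placement is an assumption)."""
    def __init__(self, channels, reduction=16, use_bias=False, use_bn=True):
        super().__init__()
        hidden = channels // reduction
        layers = [nn.Linear(channels, hidden, bias=use_bias)]
        if use_bn:
            layers.append(nn.BatchNorm1d(hidden))  # normalize bottleneck activations
        layers += [
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels, bias=use_bias),
            nn.Sigmoid(),
        ]
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(*layers)

    def forward(self, x):
        n, c, _, _ = x.shape
        s = self.fc(self.pool(x).view(n, c)).view(n, c, 1, 1)
        return x * s
```

Normalizing the bottleneck activations keeps the pre-sigmoid logits in a moderate range, which may help with the saturated 0/1 outputs mentioned above.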

eslambakr commented 4 years ago

Hello,

I am facing the same issue: most of the scales are 0.5. How did you solve it, @ArtyZe? Scales of 0.5 mean the pre-sigmoid output is all zeros, so the sigmoid outputs 0.5 everywhere, which is weird. @lishen-shirley @hujie-frank, do you suggest any solutions or tricks?

I am integrating the SE block into ResNet-50.
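
One way to confirm that diagnosis is to hook the SE sigmoids and look at the distribution of the scales; a sketch, reusing the illustrative `SEBlock` from earlier in the thread (the real model here would be the SE-ResNet-50):

```python
import torch
import torch.nn as nn

# Stand-in model reusing the SEBlock sketch from earlier in the thread;
# replace with the actual SE-ResNet-50.
model = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    SEBlock(64),
)
model.eval()

scales = []  # flattened scale activations collected from every SE sigmoid

def record_scales(module, inputs, output):
    scales.append(output.detach().flatten())

hooks = [m.register_forward_hook(record_scales)
         for m in model.modules() if isinstance(m, nn.Sigmoid)]

with torch.no_grad():
    model(torch.randn(4, 3, 224, 224))
for h in hooks:
    h.remove()

s = torch.cat(scales)
# mean ~0.5 with std ~0 means the pre-sigmoid FC output is collapsing to zero
print(f"scale mean {s.mean().item():.3f}, std {s.std().item():.3f}")
```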

Thanks in advance