Closed foralliance closed 5 years ago
1. We want to scale the values of the different layers' outputs with BN on every single channel, so we do not append a Scale layer after the BN layer. In my experiments, this made no difference to the results.
2. If we do not supply the steps, the anchor layer computes them automatically from the input image size and the feature map size.
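To make the first point concrete, here is a minimal NumPy sketch (not the FSSD code) contrasting the two normalizations under discussion. It assumes an `(N, C, H, W)` tensor; the function names are hypothetical, and the fixed scale of 20 stands in for the learnable scale that SSD-style L2 normalization typically starts from.

```python
import numpy as np

def batch_norm_no_affine(x, eps=1e-5):
    """Normalize each channel of x (N, C, H, W) to zero mean and unit
    variance over the batch and spatial axes, with no scale/shift after."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def l2_normalize(x, scale=20.0, eps=1e-10):
    """Rescale the channel vector at every spatial position so that its
    L2 norm equals `scale` (assumed fixed here; usually learnable)."""
    norm = np.sqrt((x ** 2).sum(axis=1, keepdims=True)) + eps
    return scale * x / norm

# Random feature map with a deliberately large offset and spread.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=5.0, size=(4, 8, 6, 6))

bn = batch_norm_no_affine(x)   # statistics fixed per channel
l2 = l2_normalize(x)           # vector length fixed per position
```

Both bring the feature maps to a comparable scale, but batch normalization fixes the statistics of each channel across the batch, while L2 normalization fixes the length of the channel vector at each position.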
many thanks!!
Section 4.1.3 of the paper mentions normalization. SSD uses L2 normalization, while FSSD uses batch normalization. Both methods aim to normalize the scale. Do they have the same effect? What is the difference between them?
In addition, in the code, the batch normalization operation is implemented as follows:
In many codebases, the BatchNorm layer and the Scale layer come together. Why is there only a single BatchNorm layer here?
steps = []. How should this be explained? If this parameter is empty, does it not affect the design of the anchors?
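As a rough illustration of the reply's second point, the sketch below shows how a step can be derived when none is supplied: the input image size divided by the feature map size. The function names and the 300/38 sizes are assumptions chosen for illustration, not values taken from the FSSD code.

```python
def infer_step(image_size, feature_size):
    """Step (in pixels) between neighbouring anchor centers when no
    explicit step is supplied."""
    return image_size / feature_size

def anchor_centers(image_size, feature_size, step=None, offset=0.5):
    """Return the 1-D anchor center coordinates along one axis."""
    if step is None:
        # Empty steps: fall back to the automatically computed value.
        step = infer_step(image_size, feature_size)
    return [(i + offset) * step for i in range(feature_size)]

# Hypothetical 300-pixel input with a 38-cell feature map.
auto_centers = anchor_centers(300, 38)             # step inferred as 300 / 38
manual_centers = anchor_centers(300, 38, step=8)   # step given explicitly
```

With an inferred step the centers tile the image exactly, whereas a hand-picked integer step such as 8 shifts them slightly, so leaving the list empty changes where the anchors sit only marginally.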