murdockhou closed this issue 5 years ago
Hi. Since TTFNet directly predicts the box size at the original image scale, the predictions can vary greatly (e.g., for a 512x512 input image, they may range from 10 to 200+), which may be harmful for training.
In order to avoid directly predicting large values, we introduce `s` here, as you said, to make the model predict smaller values. This helps the model converge faster in the early stages of training. `s` is set to 16, and it is not carefully selected.
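To make the effect of `s` concrete, here is a minimal sketch, not the TTFNet implementation. It assumes the network outputs four non-negative distances from a center point and that decoding simply multiplies them by `s`. The function names and the (left, top, right, bottom) ordering are hypothetical and chosen only for illustration.

```python
S = 16.0  # the scalar s mentioned above; stated to be not carefully tuned


def to_raw_targets(distances, s=S):
    """Divide original-scale distances by s to get the values the network regresses."""
    return [d / s for d in distances]


def decode_box(center, raw, s=S):
    """Scale raw outputs back up by s and offset them from the center point.

    Assumed (hypothetical) ordering of raw: left, top, right, bottom.
    """
    cx, cy = center
    left, top, right, bottom = (r * s for r in raw)
    return (cx - left, cy - top, cx + right, cy + bottom)


# Distances between 10 and 200 pixels become raw targets between 0.625 and 12.5.
raw = to_raw_targets([10.0, 200.0, 10.0, 200.0])
box = decode_box((256.0, 256.0), raw)
print(raw)  # [0.625, 12.5, 0.625, 12.5]
print(box)  # (246.0, 56.0, 266.0, 456.0)
```

With `s = 16`, the regression head only has to cover a range of roughly 0.6 to 12.5 instead of 10 to 200, which is the smaller target range the reply refers to.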
@liuzili97, thanks for your reply. I figured it out.
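Hello, and thanks for your work! In the "Gaussian Kernels for Training, Size Regression" section, I don't quite understand the role of the scalar `s`. After `s` is added, won't the values w_l, h_l, w_r, h_r predicted by the network be even smaller, and therefore harder for the network to predict?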