YimianDai / open-aff

code and trained models for "Attentional Feature Fusion"

Why is a factor of 2 used in the AFF module in the code? #33

Open YoungLNB opened 2 years ago

YoungLNB commented 2 years ago

xo = 2 * x * wei + 2 * residual * (1 - wei)

Sunkey2333 commented 1 year ago

The author answered this previously in another issue:

I believe it has no impact on the training. The reason I use the multiplication of 2 is that I want to keep the total weights the same as addition.

In the direct addition case, X + Y is actually 1 * X + 1 * Y, so the weights sum to 2. However, in the soft selection form, M(X+Y) * X + (1 - M(X+Y)) * Y, the weights sum to 1, so I multiply by 2 to keep the two cases the same. Then the only difference between 1 * X + 1 * Y and 2 * M(X+Y) * X + 2 * (1 - M(X+Y)) * Y is the dynamic weight allocation; the sum of the weights stays the same.
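
To make the argument above concrete, here is a minimal pure-Python sketch, not the repository's code. It assumes the attention output M(X+Y) can be stood in for by a sigmoid over some made-up logits; the names `sigmoid`, `soft_fuse`, and `logits` are hypothetical and chosen only for illustration. It checks that the two weights always sum to 2, and that a neutral attention value of 0.5 reproduces plain addition.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function, used here as a stand-in for the attention output M(X+Y)."""
    return 1.0 / (1.0 + math.exp(-z))

def soft_fuse(x, residual, wei, scale=2.0):
    """Element-wise soft selection: scale * x * wei + scale * residual * (1 - wei)."""
    return [scale * xi * w + scale * ri * (1.0 - w)
            for xi, ri, w in zip(x, residual, wei)]

# Toy feature values (assumed, for illustration only).
x = [1.0, -2.0, 0.5]
residual = [0.3, 4.0, -1.5]

# Hypothetical attention logits; in the real module these would come from a network.
logits = [-1.2, 0.0, 2.5]
wei = [sigmoid(z) for z in logits]

# The two weights are 2 * wei and 2 * (1 - wei); their sum is 2 for any wei.
weight_sums = [2.0 * w + 2.0 * (1.0 - w) for w in wei]

# With wei = 0.5 everywhere, the fusion collapses to plain addition x + residual.
neutral = soft_fuse(x, residual, [0.5] * len(x))
plain_sum = [xi + ri for xi, ri in zip(x, residual)]

fused = soft_fuse(x, residual, wei)
print(weight_sums)
print(neutral, plain_sum)
print(fused)
```

Without the factor of 2, the neutral case would give 0.5 * (x + residual), that is, half the magnitude of direct addition; the scale only matches the two magnitudes and does not change how the weight is split between the two inputs.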