maktu6 closed this issue 4 years ago
Yes, I think so. Please see the updated comments in that file. When computing the residual coding, note that the grouping centers d do not need to go through the \sum{q_ij d} / \sum{q_ij} step, because d is constant within the sum: \sum{q_ij d} = \sum{q_ij} * d.
For residual coding, we follow the commonly used nonlinear feature encoding scheme. You can definitely try removing the residual term, though it might hurt accuracy to some extent. But I think you would still need the normalization term to keep the magnitude consistent.
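To make the identity above concrete, here is a minimal pure-Python sketch on toy data. The function names, shapes (K regions, N positions, C channels), and numbers are illustrative assumptions, not code from the repository; `aggregate` plays the role that `torch.bmm(assign, x)` would play for a single sample.

```python
def aggregate(assign, feats):
    """Unnormalized weighted sum per region: out[k][c] = sum_n assign[k][n] * feats[n][c]."""
    channels = len(feats[0])
    return [
        [sum(a * f[c] for a, f in zip(row, feats)) for c in range(channels)]
        for row in assign
    ]


def residual_encode(assign, feats, centers):
    """Direct form: sum_n a[k][n] * (x[n] - d[k]) / sum_n a[k][n]."""
    out = []
    for row, d in zip(assign, centers):
        mass = sum(row)  # total assignment weight of this region
        out.append([
            sum(a * (f[c] - d[c]) for a, f in zip(row, feats)) / mass
            for c in range(len(d))
        ])
    return out


def residual_encode_shortcut(assign, feats, centers):
    """Same value: since sum_n a * d = d * sum_n a, only x needs the weighted mean."""
    qx = aggregate(assign, feats)
    return [
        [qx_kc / sum(row) - d_c for qx_kc, d_c in zip(qx_k, d)]
        for qx_k, row, d in zip(qx, assign, centers)
    ]


# Toy data (hypothetical): K=2 regions, N=3 positions, C=2 channels.
assign = [[0.7, 0.2, 0.1], [0.3, 0.8, 0.9]]
feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
centers = [[1.5, 2.5], [4.0, 5.0]]

direct = residual_encode(assign, feats, centers)
shortcut = residual_encode_shortcut(assign, feats, centers)
unnormalized = aggregate(assign, feats)
```

In this toy case the two residual forms agree up to floating-point error, while `unnormalized` grows with the total assignment weight of each region (the second region carries weight 2.0 rather than 1.0), which is the magnitude inconsistency the normalization term is meant to remove.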
Thanks for your reply!
Does the implementation of Region Feature Extraction here match Equation (3) in the paper? Can you explain it more concretely? And what would be the difference if I just used
qx = torch.bmm(assign, x)
as the out? Would the model performance degrade?