Closed wml666666 closed 1 month ago
Hello, this operation scales up the attention values. Combined with the subsequent softmax, it produces a sharper focus on the key elements the attention mechanism is trying to highlight. However, in our experiments this operation did not make a significant difference.
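A minimal sketch of the effect being described: dividing the logits by 0.5 (what `attn.div_(0.5)` does in place in PyTorch) doubles them, which is equivalent to a softmax temperature below 1 and concentrates more probability mass on the largest entry. The example values are illustrative, not taken from the repo.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical attention logits (illustrative only)
attn = np.array([2.0, 1.0, 0.5])

# Equivalent of the in-place `attn.div_(0.5)`: divide by 0.5, i.e. multiply by 2
scaled = attn / 0.5

p_plain = softmax(attn)
p_scaled = softmax(scaled)

# The scaled distribution puts more mass on the dominant key
print(p_plain.max() < p_scaled.max())  # → True
```

This is the standard temperature-scaling argument; as the authors note, in their experiments the sharpening had little practical effect.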
Okay, I understand. Thank you for your reply!
Hello, sorry to bother you. Formula (2) in your paper mentions Ecls (class embedding), but when I studied your code, I did not find this variable in ffa.py.
In our experiments, the feature queries are initially class-agnostic, so it is reasonable to assign a class embedding to each category for discrimination. In the final version, however, each category has its own exclusive feature queries, which makes the class embedding redundant. We therefore removed it to simplify the code. Sorry for the confusion, and I hope this clarifies the issue.
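A small sketch of why the class embedding becomes redundant, under the assumption described above (all shapes, names, and values here are hypothetical, not from `ffa.py`): shared queries plus a per-class embedding Ecls produce one query set per class, but if each class already owns its queries, the additive embedding can be folded into them.

```python
import numpy as np

num_classes, num_queries, dim = 3, 4, 8
rng = np.random.default_rng(0)

# Formula (2) style: shared, class-agnostic queries plus a class embedding E_cls
shared_q = rng.normal(size=(num_queries, dim))
e_cls = rng.normal(size=(num_classes, 1, dim))
queries_with_cls = shared_q[None] + e_cls        # shape (C, Q, D), one set per class

# Released-code style: each class has exclusive queries, so E_cls is absorbed
per_class_q = rng.normal(size=(num_classes, num_queries, dim))

# Both formulations yield the same per-class query tensor shape
print(queries_with_cls.shape, per_class_q.shape)
```

Since `per_class_q` is learned freely per class, it can represent anything `shared_q + e_cls` could, which is the redundancy the reply refers to.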
Okay, thank you for your reply.
Excuse me, while reading the code in your ffa.py file I noticed the line `attn.div_(0.5)` in the forward function of the PrototypesAssignment class, which is not mentioned in your paper. Could you please explain it? Thank you.