Closed · Stavr0sStuff closed this issue 3 years ago
Hello,

very interesting paper, and nice to publish parts of the code along with it!

A couple of questions: `self.v_conv` has a bias attached to it. Looking at other 'attention' implementations, it seems that those mostly exclude the bias (as you also do for the keys and queries). Did you see any improvement from adding a bias there?

Kind regards, steven

Hello, thanks for your attention. For the questions: `bias=True`, and `self.q_conv.bias = self.k_conv.bias` ties the query and key projections to a single shared bias parameter.

Best Regards, Meng-Hao
Thanks for your answers.

Regarding question 1: my apologies, I wrote the question in a confusing way. I was not so much wondering whether a bias should be added for the keys and queries, but rather whether the bias could be removed from the dense layer that calculates the values (unattended, i.e. before they are multiplied by the attention weights). Most implementations of the original 'Attention Is All You Need' paper seem not to use a bias in the value calculation.
Yes, you can remove the bias in the value calculation. In experiments, it does not seem necessary in the point cloud transformer.