MenghaoGuo / PCT

Jittor implementation of PCT: Point Cloud Transformer

Use of bias in value layer #4

Closed Stavr0sStuff closed 3 years ago

Stavr0sStuff commented 3 years ago

Hello,

Very interesting paper, and nice of you to publish parts of the code along with it!

A couple of questions:

Kind regards, steven

MenghaoGuo commented 3 years ago

Hello, and thanks for your interest. Regarding your questions:

  1. In our experiments, we did not observe a significant improvement from setting bias=True and tying self.q_conv.bias = self.k_conv.bias.
  2. Using this initialization ensures a reasonable attention map at the beginning of training, which improves the stability of the training process (see the attention sketch after this list).
  3. We use farthest point sampling (FPS) in 3-dimensional Euclidean space. Using FPS in a higher-dimensional feature space might give better performance (see the FPS sketch after this list).
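
For illustration, here is a minimal single-head point-cloud self-attention sketch along the lines of points 1 and 2. It is a sketch only, written against PyTorch rather than Jittor, and the class name `PointSelfAttention`, the `share_qk` flag, and the exact layer widths are assumptions for this example, not the repository's actual attention layer.

```python
import torch
import torch.nn as nn


class PointSelfAttention(nn.Module):
    """Minimal single-head self-attention over point features of shape (B, C, N)."""

    def __init__(self, channels, share_qk=False):
        super().__init__()
        # Query/key projections without bias; per the reply above, adding a
        # (shared) bias here gave no significant improvement in experiments.
        self.q_conv = nn.Conv1d(channels, channels // 4, 1, bias=False)
        self.k_conv = nn.Conv1d(channels, channels // 4, 1, bias=False)
        if share_qk:
            # One possible reading of the "initialization" mentioned in point 2
            # (an assumption for this sketch): tie the query and key weights so
            # the attention map is well-behaved at the start of training.
            self.q_conv.weight = self.k_conv.weight
        # Value projection; bias=True here, but it can be dropped (see the
        # follow-up discussion further down in this thread).
        self.v_conv = nn.Conv1d(channels, channels, 1, bias=True)
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):                           # x: (B, C, N)
        q = self.q_conv(x).permute(0, 2, 1)         # (B, N, C//4)
        k = self.k_conv(x)                          # (B, C//4, N)
        v = self.v_conv(x)                          # (B, C, N)
        attn = self.softmax(torch.bmm(q, k))        # (B, N, N), normalized over keys
        out = torch.bmm(v, attn.permute(0, 2, 1))   # (B, C, N)
        return out
```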
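
Point 3 refers to farthest point sampling. A minimal sketch of greedy FPS on 3-D coordinates follows; the function name, the fixed starting index, and the example sizes are choices made for this illustration, not the repository's implementation.

```python
import torch


def farthest_point_sample(xyz, n_samples):
    """Greedy farthest point sampling on (N, 3) coordinates; returns sample indices."""
    n_points = xyz.shape[0]
    selected = torch.zeros(n_samples, dtype=torch.long)
    # Squared distance from every point to the nearest already-selected point.
    min_dist = torch.full((n_points,), float("inf"))
    # Start from an arbitrary point (index 0 here; a random start is also common).
    farthest = torch.tensor(0, dtype=torch.long)
    for i in range(n_samples):
        selected[i] = farthest
        dist = torch.sum((xyz - xyz[farthest]) ** 2, dim=-1)  # squared Euclidean distance
        min_dist = torch.minimum(min_dist, dist)
        farthest = torch.argmax(min_dist)
    return selected


# Usage: sample 512 seed points from a cloud of 1024 points in 3-D space.
points = torch.rand(1024, 3)
seeds = points[farthest_point_sample(points, 512)]
```

Passing (N, D) feature vectors instead of (N, 3) coordinates gives the higher-dimensional variant mentioned in point 3.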

Best Regards, Meng-Hao

ds-steventondeur commented 3 years ago

Thanks for your answers.

Regarding question 1: my apologies, I wrote the question in a confusing way. I was not so much wondering whether a bias should be added for the keys and queries, but rather whether the bias could be removed from the dense layer that computes the values (i.e. before they are multiplied by the attention weights). Most implementations of the original 'Attention Is All You Need' paper seem to omit the bias in the value calculation.

MenghaoGuo commented 3 years ago

Yes, you can remove the bias in the value calculation. In our experiments, it does not seem necessary for the point cloud transformer.
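
In terms of the attention sketch above, the change under discussion is just the `bias` flag on the value projection. A small, self-contained illustration (the variable names and channel width are made up for this example):

```python
import torch.nn as nn

channels = 128  # example channel width

# Value projection with a learnable bias, as in the earlier sketch ...
v_conv_biased = nn.Conv1d(channels, channels, 1, bias=True)

# ... and the bias-free variant used by most "Attention Is All You Need"-style
# implementations; per the reply above, either choice works for PCT in practice.
v_conv_no_bias = nn.Conv1d(channels, channels, 1, bias=False)
```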