Closed JLU-Neal closed 3 years ago
For the first question: since Menghao's code was directly borrowed and translated from the author's official Jittor implementation (https://github.com/MenghaoGuo/PCT) at the time I implemented this, I don't know whether the author intended to use max pooling, or whether there is simply little performance difference between MA-Pool and max pooling.
For the second question: yes, the training schedules for the three methods are different. In my view, an over-tuned schedule can hide the actual contribution of the network architecture. In any case, I would appreciate it if someone could retrain all of these models with a cosine annealing schedule, to compare them in a truly fair setting.
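For reference, cosine annealing is straightforward to compute; here is a minimal sketch of the schedule (the specific `lr_max=0.01` matches the paper's initial rate, while `lr_min=0.0` and the step count are my own assumptions for illustration):

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max=0.01, lr_min=0.0):
    """Cosine-annealed learning rate (no warm restarts).

    Decays smoothly from lr_max at step 0 to lr_min at total_steps.
    """
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * step / total_steps))

# Example: a 200-epoch schedule starting at 0.01
schedule = [cosine_annealing_lr(e, 200) for e in range(201)]
```

In PyTorch this corresponds to `torch.optim.lr_scheduler.CosineAnnealingLR`, which is the common way to train all three models under the same schedule.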
1) In the paper, the output of the encoder is processed by an MA-Pool layer, instead of a single max pooling. 2) The initial learning rate in the paper is 0.01, with a cosine annealing schedule.
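To make the distinction in 1) concrete, here is a minimal sketch under the assumption that MA-Pool means concatenating max-pooled and average-pooled features over the point dimension (my reading of the paper; the function names below are hypothetical, not from either codebase):

```python
def max_pool(features):
    # features: N per-point feature vectors, each of length D
    # returns a single D-dim vector: elementwise max over points
    return [max(col) for col in zip(*features)]

def ma_pool(features):
    # assumed MA-Pool: concatenate max- and average-pooled features
    # returns a 2*D-dim vector, richer than max pooling alone
    n = len(features)
    max_part = [max(col) for col in zip(*features)]
    avg_part = [sum(col) / n for col in zip(*features)]
    return max_part + avg_part

feats = [[1.0, 4.0], [3.0, 2.0], [2.0, 6.0]]
# max_pool(feats) -> [3.0, 6.0]
# ma_pool(feats)  -> [3.0, 6.0, 2.0, 4.0]
```

If the performance gap between the two is small, that would explain why the official implementation may have settled on plain max pooling.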