FabianFuchsML / se3-transformer-public

code for the SE3 Transformers paper: https://arxiv.org/abs/2006.10503

Reproducibility on ScanObjectNN #30

Closed Siyeong-Lee closed 1 year ago

Siyeong-Lee commented 1 year ago

Hi, Fabian!

I am experimenting with ScanObjectNN by taking the model configured for the QM9 dataset, setting its output channels to 15, and replacing the MAE loss with cross-entropy. Additionally, the batch size is set to 4 and the learning rate is adjusted accordingly.
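Swapping the QM9 regression target for 15-way classification amounts to changing the final output dimension and the loss. A minimal numpy sketch of the cross-entropy computation (the function name and shapes here are illustrative, not from the repository):

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy for a batch of class logits.

    logits: (B, C) raw model outputs (here C=15 for ScanObjectNN);
    labels: (B,) integer class indices.
    """
    # Numerically stable log-softmax
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the true class, averaged over the batch
    return -log_probs[np.arange(labels.shape[0]), labels].mean()
```

In a PyTorch training loop this would simply be `torch.nn.CrossEntropyLoss` applied to the model's 15-dimensional output.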

For the input values, I set the edge features to {} and the node features to {0: normalized positions of the points}, and used a graph generated by KNN (k=10). Under these settings, I am not able to reproduce the paper's experimental results.

Please let me know if there is anything I should be concerned about.

Thank you.

FabianFuchsML commented 1 year ago

Hi Siyeong-Lee,

Happy to hear that you are interested in our work!

The relevant sections in the paper are 4.2 and D.1 in the appendix, where we discuss a number of details. Particularly important is the part about symmetry breaking, which is an artefact of objects being aligned with the gravitational axis and the depth dimension of RGB-D cameras.

the batch size is set to 4

This seems quite low to me and might hurt performance. We used batch size 10.

For the input values, I set the edge features to {}

I would advise using the distances between pairs of points as edge features.
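Combining this advice with the setup described above, the input pipeline would build a k-NN graph whose scalar (degree-0) edge features are the pairwise distances, with normalized positions as node features. A minimal numpy sketch under those assumptions (the repository itself works with DGL graphs; this function is illustrative):

```python
import numpy as np

def knn_graph_with_features(points, k=10):
    """Build a k-NN graph over a point cloud.

    points: (N, 3) array of xyz coordinates.
    Returns (src, dst) edge index arrays, per-edge distance features,
    and normalized positions to use as type-0 node features.
    """
    n = points.shape[0]
    # Pairwise Euclidean distances, shape (N, N)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # k nearest neighbours of each point, excluding the point itself
    nbrs = np.argsort(dist, axis=1)[:, 1:k + 1]
    src = np.repeat(np.arange(n), k)
    dst = nbrs.reshape(-1)
    # Scalar (degree-0) edge feature: distance between the endpoints
    edge_dist = dist[src, dst][:, None]
    # Normalize positions (zero mean, unit max extent) for node features
    centred = points - points.mean(axis=0)
    node_pos = centred / np.abs(centred).max()
    return src, dst, edge_dist, node_pos
```

Since the distance is invariant under rotations and translations, it is a valid degree-0 edge feature for an SE(3)-equivariant model.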

In general, bear in mind that this repository is a re-implementation of the original code for IP reasons. Hence, the optimal hyperparameters might well differ depending on how accurately we managed to re-implement it. E.g., differences in parameter initialisation might mean different optimal learning rates, etc.

However, since Nvidia published their accelerated version, I would recommend that anyone use their implementation. They managed to speed up training of the SE(3)-Transformer by up to 21x and reduce memory consumption by up to 43x, which is obviously amazing. This will allow you to experiment much more quickly and try out common tricks in point cloud classification such as data augmentation, pre-training, ensembling, etc.

https://developer.nvidia.com/blog/accelerating-se3-transformers-training-using-an-nvidia-open-source-model-implementation/

Best, Fabian

Siyeong-Lee commented 1 year ago

Thank you for the reply. Your comments have been very helpful.