Number of points sampled for SUN RGB-D

Divadi commented 3 years ago

Hello! Thank you for open-sourcing your codebase.

I wanted to ask, how many points per scene are you sampling for SUN RGB-D? For the general 3D object detection works, 20k points sampled from 50k during training & testing is standard, but it seems that 40k is used in SESS & 3DIoUMatch.

When looking at SESS's paper: as well as their code: https://github.com/Na-Z/sess/blob/f1bbb44ea6ed73bb71bce12f54ca9bc33746dce8/scripts/run_sess_sunrgbd.py#L16-L26 Default num_points is 40k: https://github.com/Na-Z/sess/blob/f1bbb44ea6ed73bb71bce12f54ca9bc33746dce8/train_sess.py#L34 Input into SUNRGB-D dataset: https://github.com/Na-Z/sess/blob/f1bbb44ea6ed73bb71bce12f54ca9bc33746dce8/train_sess.py#L97-L108 40k is used for SESS.

Similarly in this repo, (I think this repository is based on SESS), https://github.com/THU17cyz/3DIoUMatch/blob/ace9b2e783cd4b9d203998fec516d6c880a22e6d/run_train.sh#L7-L8 Then I think default is used: https://github.com/THU17cyz/3DIoUMatch/blob/ace9b2e783cd4b9d203998fec516d6c880a22e6d/train.py#L38 Then passed into SUNRGB-D dataset, overriding the default parameter of num_points=20000 in sunrgbd https://github.com/THU17cyz/3DIoUMatch/blob/ace9b2e783cd4b9d203998fec516d6c880a22e6d/train.py#L112-L127

Please let me know if I overlooked anything.

Also, would it be possible to have training logs so I can see how performance progresses over the iterations?

yezhen17 commented 3 years ago

Hi @Divadi , I just followed SESS and didn't change the default. I am not sure about how the performance will change if we use 20k.

yezhen17 commented 3 years ago

log_train.txt Training logs for ScanNet 10%

Divadi commented 3 years ago

Thank you so much for your help in this repository & the PV-RCNN version. I will look into the training logs. I have experimented with 20k vs 40k, it seems without changing PointNet parameters, there is no statistically significant difference.

I have tried re-implementing parts of this repository into the mmdetection3d repository. So far, I have a simpler setup: no IoU module & LHS, so just filtering Obj&Cls on VoteNet. (Akin to Row 3 here):

Testing this on SUN RGB-D, however, I often find that although performance does improve for the first perhaps 200~400 so epochs, afterwards, the validation metrics decline. For instance, for the split you released:

AP@.25/,50:
Pretrained: 32.20/13.43
--start SSL training--
Epoch 400 Teacher: 35.22/17.52
Epoch 400 Student: 31.82/10.78
--lr decrease--
Epoch 450 Teacher: 35.02/17.39
Epoch 450 Student: 35.25/15.95
... more training
Epoch 800 Teacher: 33.54/14.57
Epoch 800 Student: 32.44/14.43
... more training
Epoch 1000 Teacher: 32.72/14.84
Epoch 1000 Student: 32.95/15.02

At the first LR decrease, there is a huge improvement in student (this matches your plots on ScanNet in the paper). However, I found the performance steadily degrade with the student maintaining similar #s (which I guess also matches the ScanNet plots). Further, although pre-trained performance is better than paper, final performance is markedly worse. I was wondering if you had encountered anything like this in your experiments, or have any intuition for what may be awry? I have tried a number of different hyperparameter setups, but they seem to often carry this trend.

Again, thank you!

yezhen17 commented 3 years ago

The pretrain performance is higher than that in the paper because you only used one 'lucky' split and the average of three splits is worse.

As for your degrading performance, I have not encountered such trend. There are two things to notice, though: evaluation should use the student model, and the teacher model should be kept in train mode.

Divadi commented 3 years ago

I see, I will generate and try other splits, thank you for your help!

yezhen17 / 3DIoUMatch

Number of points sampled for SUN RGB-D #11