DeyvidKochanov-TomTom / kprnet


How to eval with multiple batches #11

Closed huixiancheng closed 3 years ago

huixiancheng commented 3 years ago

Hi, Dear @DeyvidKochanov-TomTom.

1. With your generous help, I have trained my model with a single GPU. But in the eval part there is an error that I can't debug (in fact it does not affect training). [screenshot of the error] https://github.com/DeyvidKochanov-TomTom/kprnet/blob/587a616b955b6be55b223e031e9ed39a8588d149/train_kitti.py#L94-L102 If the eval batch_size is set to a value other than 1, this error is raised because the label size (i.e. the number of points) is not the same for every scan in a batch. Can this error be fixed in a simple way? Using multiple batches could improve the speed of training.

2. Why are the values normalized to be between -10 and 10? https://github.com/DeyvidKochanov-TomTom/kprnet/blob/587a616b955b6be55b223e031e9ed39a8588d149/datasets/semantic_kitti.py#L113-L116 In RangeNet++ they standardize the input this way instead:

proj = torch.cat([proj_range.unsqueeze(0).clone(),
                  proj_xyz.clone().permute(2, 0, 1),
                  proj_remission.unsqueeze(0).clone()])
proj = (proj - self.sensor_img_means[:, None, None]
        ) / self.sensor_img_stds[:, None, None]
proj = proj * proj_mask.float()

The values are:

img_means: # range, x, y, z, signal
  - 12.12
  - 10.88
  - 0.23
  - -1.04
  - 0.21
img_stds: # range, x, y, z, signal
  - 12.32
  - 11.47
  - 6.91
  - 0.86
  - 0.16

What is the difference between these two approaches? It feels more like black magic.
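For concreteness, here is a minimal sketch of the two strategies in PyTorch. The per-scan min/max rescaling to [-10, 10] is only paraphrased from the question, not copied from semantic_kitti.py, and the helper names and tensor shapes are assumptions:

import torch

def minmax_to_range(x, lo=-10.0, hi=10.0):
    # Rescale each channel of a (C, H, W) tensor into [lo, hi] using its own
    # per-scan min/max (the idea described in question 2; the actual kprnet
    # code may differ in details).
    flat = x.reshape(x.shape[0], -1)
    x_min = flat.min(dim=1).values[:, None, None]
    x_max = flat.max(dim=1).values[:, None, None]
    scaled = (x - x_min) / (x_max - x_min + 1e-8)   # -> [0, 1]
    return scaled * (hi - lo) + lo                  # -> [lo, hi]

def standardize(x, means, stds):
    # RangeNet++-style standardization with fixed per-channel dataset stats.
    return (x - means[:, None, None]) / stds[:, None, None]

# Fake 5-channel (range, x, y, z, remission) projection of a 64x2048 scan
proj = torch.randn(5, 64, 2048)
means = torch.tensor([12.12, 10.88, 0.23, -1.04, 0.21])
stds = torch.tensor([12.32, 11.47, 6.91, 0.86, 0.16])

a = minmax_to_range(proj)           # per-scan min/max scaling to [-10, 10]
b = standardize(proj, means, stds)  # fixed dataset mean/std

The first variant adapts to each individual scan, while the second uses dataset-wide statistics; functionally both just bring the channels to a comparable scale.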

huixiancheng commented 3 years ago

Is this the reason that we must use 8 GPUs? :sob:

DeyvidKochanov-TomTom commented 3 years ago

:fearful: The validation doesn't run in parallel on the 8 GPUs, so yes, it typically takes a large share of the overall training time. You can make it run less often. If you want to evaluate on batches of multiple examples, you will have to add the 3D point padding logic from the train transform to the test transform; a rough sketch of the idea follows below.
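A sketch of that padding idea, with hypothetical names and a guessed point budget rather than the repo's actual transform code:

import numpy as np

MAX_POINTS = 150_000  # assumed upper bound on points per SemanticKITTI scan

def pad_scan(points, labels, max_points=MAX_POINTS):
    # Pad (N, C) points and (N,) labels up to a fixed max_points so that the
    # default DataLoader collate can stack a batch; padded labels get an
    # ignore value (255 here is an assumption) and a mask marks real points.
    n = points.shape[0]
    padded_points = np.zeros((max_points, points.shape[1]), dtype=points.dtype)
    padded_labels = np.full((max_points,), 255, dtype=labels.dtype)
    mask = np.zeros((max_points,), dtype=bool)
    padded_points[:n] = points
    padded_labels[:n] = labels
    mask[:n] = True
    return padded_points, padded_labels, mask

Once every sample in the batch has the same shape, the eval loop can run with batch_size > 1 and use the mask (or the ignore label) to exclude the padded points from the IoU computation.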

As for the normalization: the depth normalization seems to be the same except that the scales are different, and I don't think the scale difference would have a huge effect. The mean and std of the reflectivity do seem different; I think in the KPR pipeline it's normalized based on the min/max values. This could make some difference, so it might be a good thing to try out.

If you want to train on 1 GPU, it might be a good idea to switch the backbone. The KPConv layer is also not that small, I think because it unrolls over all the nearest neighbors, but not much can be done about that :grin:

huixiancheng commented 3 years ago

Well, I will try to pad it. Yes, I have tried other backbones. Thank you very much, Sir!