aharley / simple_bev

A Simple Baseline for BEV Perception
MIT License

Questions on architecture design choices #24

Closed by Kait0 1 year ago

Kait0 commented 1 year ago

Hi, I have a couple of questions about your NN architecture design. Could you give the motivation for these particular design choices (or, if they are copied from some other work, point me to it)?

For both ResNet backbones you stopped at the 3rd block and did not use the 4th block. What was the reason? https://github.com/aharley/simple_bev/blob/be46f0ef71960c233341852f3d9bc3677558ab6d/nets/segnet.py#L164 https://github.com/aharley/simple_bev/blob/be46f0ef71960c233341852f3d9bc3677558ab6d/nets/segnet.py#L68

What is the motivation for the use of instance normalization in the decoders? https://github.com/aharley/simple_bev/blob/be46f0ef71960c233341852f3d9bc3677558ab6d/nets/segnet.py#L78

Why did you not use activation functions for the up-sampling layers in the BEV grid? https://github.com/aharley/simple_bev/blob/be46f0ef71960c233341852f3d9bc3677558ab6d/nets/segnet.py#L45

aharley commented 1 year ago

Great questions.

3rd block: I think we did this so that the resolutions would match up with the EfficientNet versions (which came from FIERY).
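To make the resolution point concrete, here is a small sketch of how the cumulative downsampling factor grows through a standard ResNet. The stage strides are those of the usual ResNet layout (stem of stride 4, then strides 1, 2, 2, 2); the input size and the helper name `cumulative_strides` are assumptions for illustration, not values taken from the repo.

```python
# Per-stage spatial strides of a standard ResNet (stem = stride-2 conv + stride-2 maxpool).
STAGE_STRIDES = [
    ("stem", 4),
    ("layer1", 1),
    ("layer2", 2),
    ("layer3", 2),
    ("layer4", 2),
]


def cumulative_strides(stages):
    """Return the total downsampling factor after each stage."""
    total = 1
    result = {}
    for name, stride in stages:
        total *= stride
        result[name] = total
    return result


strides = cumulative_strides(STAGE_STRIDES)

# Hypothetical input size, chosen to be divisible by 32 so the divisions are exact.
height, width = 448, 800
for name, s in strides.items():
    print(f"{name}: stride {s}, feature map {height // s} x {width // s}")
```

Stopping after `layer3` leaves features at stride 16, while `layer4` would halve the resolution again to stride 32.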

Instance norm: This probably makes only a tiny difference in practice. In general I prefer instancenorm over batchnorm because instancenorm is friendlier to experiments with low batch sizes.
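A quick way to see the batch-size point is to normalize the same feature map once alone and once next to a second sample. The sketch below uses PyTorch's `nn.InstanceNorm2d` and `nn.BatchNorm2d` on made-up tensors (shapes and values are arbitrary assumptions): the instance-norm output for the first sample does not change with its batchmates, while the batch-norm output in training mode does.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two feature maps with very different statistics, each of shape (C=4, H=8, W=8).
a = torch.randn(4, 8, 8)
b = 5.0 * torch.randn(4, 8, 8) + 3.0

inorm = nn.InstanceNorm2d(4)       # statistics per sample and channel, over H and W
bnorm = nn.BatchNorm2d(4).train()  # training mode: statistics per channel, over N, H and W

alone = a.unsqueeze(0)             # batch containing only `a`
paired = torch.stack([a, b])       # batch containing `a` and `b`

# Compare the output for `a` in both batches.
instance_gap = (inorm(alone)[0] - inorm(paired)[0]).abs().max().item()
batch_gap = (bnorm(alone)[0] - bnorm(paired)[0]).abs().max().item()

print(f"instance norm, max change in output for a: {instance_gap:.2e}")
print(f"batch norm,    max change in output for a: {batch_gap:.2e}")
```

With small batches, the batch statistics that batch norm relies on become noisy estimates, which is the usual argument for instance norm in that regime.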

No activation in upsampling: This was copied from FIERY: https://github.com/wayveai/fiery/blob/master/fiery/layers/convolutions.py
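For readers who have not opened the linked file, the pattern being discussed looks roughly like the sketch below: upsample, 1x1 convolution, normalization, then an element-wise sum with a skip connection, with no nonlinearity in between. This is an approximation under assumptions, not a copy of either repo: the class name `UpsamplingAddSketch`, the channel counts, and the choice of `nn.InstanceNorm2d` (following the instance-norm answer above) are all illustrative.

```python
import torch
import torch.nn as nn


class UpsamplingAddSketch(nn.Module):
    """Hypothetical upsample-and-add block with no activation function."""

    def __init__(self, in_channels, out_channels, scale_factor=2):
        super().__init__()
        self.upsample_layer = nn.Sequential(
            nn.Upsample(scale_factor=scale_factor, mode="bilinear", align_corners=False),
            nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
            nn.InstanceNorm2d(out_channels),
            # Note: no ReLU here; the normalized features go straight into the sum.
        )

    def forward(self, x, x_skip):
        # Upsample the coarse features and add the higher-resolution skip features.
        return self.upsample_layer(x) + x_skip


block = UpsamplingAddSketch(in_channels=64, out_channels=32)
x = torch.randn(1, 64, 25, 25)       # coarse BEV features (arbitrary shape)
x_skip = torch.randn(1, 32, 50, 50)  # skip features at twice the resolution
out = block(x, x_skip)
print(tuple(out.shape))
```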

Kait0 commented 1 year ago

Thanks.