valeoai / RADIal


FPN Implementation #24

unfazing closed 1 year ago

unfazing commented 1 year ago

Good day! I would like to clarify the implementation of the feature pyramid network. Conventionally, an FPN has lateral connections between the bottom-up pyramid and the top-down pyramid. Where are these lateral connections in FFTRadNet?

Edit: I figured out that the Range-Angle Decoder is actually the top-down path of the FPN, and that the lateral connections are made where the outputs of blocks 2, 3 and 4 of the FPN are concatenated with the outputs of the top-down path.

Conventionally, an FPN makes a prediction from each output level, each at a different resolution, whereas in this case only the final output of the top-down path of the RA decoder is used for prediction. Would it be more accurate to consider this a U-Net?

Apologies for the questions that may seem daft, I am new to this sphere.

ArthurOuaknine commented 1 year ago

Hi. Sorry for the late reply. As you said, the bottom-up part of the FPN is the "FPN Encoder" and the top-down part is in the "Range-Angle Decoder". You can refer to Figure 3 of our article for the scheme: https://openaccess.thecvf.com/content/CVPR2022/papers/Rebut_Raw_High-Definition_Radar_for_Multi-Task_Learning_CVPR_2022_paper.pdf

The lateral connections are represented as black arrows and link the bottom-up and top-down pathways. Indeed, we do not predict at each level of the top-down pathway, so we could have used a U-Net-like architecture for this part. The two models are quite similar. However, the FPN architecture processes the bottom-up feature maps with a 1x1 convolution before concatenating them with the top-down pathway. We used this step to adjust the dimension of the azimuth axis of the tensor, while learning a combination of the features, before swapping the tensor axes (see Section 4.3). Note that we had to adapt and swap the dimensions to obtain a Range-Azimuth map from a Range-Doppler-Azimuth tensor.
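To make the lateral step more concrete, here is a minimal NumPy sketch of the idea: a 1x1 convolution on an encoder feature map, an axis swap, then concatenation with an upsampled coarser map. This is not the repository's PyTorch code. The function names, the tensor shapes, the (channels, range, Doppler) layout and the nearest-neighbour upsampling (standing in for a learned upsampling layer) are all assumptions chosen for illustration.

```python
import numpy as np


def conv1x1(x, weight):
    """1x1 convolution: mixes channels only, leaving the spatial axes untouched.

    x: (C_in, R, D) feature map, weight: (C_out, C_in).
    """
    return np.einsum("oc,crd->ord", weight, x)


def lateral(x, weight):
    """Lateral connection: 1x1 conv to set the azimuth size, then axis swap."""
    y = conv1x1(x, weight)          # (A, R, D): channel count becomes azimuth size
    return np.swapaxes(y, 0, 2)     # (D, R, A): Doppler now plays the channel role


def upsample_range(x, factor=2):
    """Nearest-neighbour upsampling along the range axis (assumed stand-in)."""
    return np.repeat(x, factor, axis=1)


def merge(coarse, fine):
    """Upsample the coarser map and concatenate along the channel axis."""
    return np.concatenate((upsample_range(coarse), fine), axis=0)


rng = np.random.default_rng(0)
n_azimuth = 6                                   # hypothetical azimuth size

x_fine = rng.standard_normal((8, 16, 4))        # (C, range, Doppler), illustrative
x_coarse = rng.standard_normal((12, 8, 2))      # half the range resolution
w_fine = rng.standard_normal((n_azimuth, 8))
w_coarse = rng.standard_normal((n_azimuth, 12))

t_fine = lateral(x_fine, w_fine)                # shape (4, 16, 6)
t_coarse = lateral(x_coarse, w_coarse)          # shape (2, 8, 6)
merged = merge(t_coarse, t_fine)                # shape (6, 16, 6)
print(merged.shape)
```

Only `merged`, the output of the last merge, would feed a prediction head in this sketch, which is why the structure resembles a U-Net; the 1x1 convolution before the concatenation is the FPN-style ingredient.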

I hope it will help.