Closed tangbohu closed 4 years ago
Hi @tangbohu, thanks for your interest. For the last layer, I follow the design of HardNet and L2Net from 2D descriptor learning, where the ReLU is omitted. As for BN, since we are learning a representation for each point, adding BN seemed to hurt the representational power of the final descriptor (though some other works do add BN to the last layer). I actually tried architectures both with and without BN and ReLU on the last layer, and if I remember correctly this choice had a negligible effect on the final result.
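For anyone landing here later, the convention described above can be sketched as follows. This is a minimal illustrative PyTorch sketch, not the paper's actual architecture (layer widths and the `DescNet` name are made up for the example): every layer except the last is followed by BatchNorm and ReLU, the final layer outputs the raw descriptor, and the descriptor is L2-normalized at the end, as in HardNet/L2Net.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DescNet(nn.Module):
    """Toy per-point descriptor network (illustrative only).

    All layers except the last are followed by BN + ReLU;
    the last layer has neither, and its output is L2-normalized.
    """
    def __init__(self, in_dim=3, feat_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_dim, 64, 1),
            nn.BatchNorm1d(64),
            nn.ReLU(inplace=True),
            nn.Conv1d(64, 64, 1),
            nn.BatchNorm1d(64),
            nn.ReLU(inplace=True),
            nn.Conv1d(64, feat_dim, 1),  # last layer: no BN, no ReLU
        )

    def forward(self, x):
        # x: (B, C, N) per-point features
        desc = self.net(x)
        # unit-length descriptors, following the HardNet/L2Net convention
        return F.normalize(desc, p=2, dim=1)

x = torch.randn(4, 3, 128)
model = DescNet().eval()
d = model(x)
print(d.shape)  # (batch, feat_dim, num_points)
```

Since the descriptors are L2-normalized anyway, a final BN would rescale features right before normalization, which is one intuition for why it adds little here.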
Thanks for your great work! However, I am very curious about the sentence in the paper: "All layers except the last one are followed by batch normalization and ReLU". I wonder why BN and ReLU are not adopted for the last layer, as they are in 2D CNNs. Looking forward to your kind reply!