csrhddlam / axial-deeplab

This is a PyTorch re-implementation of Axial-DeepLab (ECCV 2020 Spotlight)
https://arxiv.org/abs/2003.07853
Apache License 2.0

Why batch normalization after the qkv transform? #33

Open lkqnaruto opened 3 years ago

lkqnaruto commented 3 years ago

I wonder why batch normalization is applied after the qkv transform. Is it because of the covariate shift issue?

https://github.com/csrhddlam/axial-deeplab/blob/79088edb4bdb8c94351d85f54272ec12b9e79c8b/lib/models/axialnet.py#L31-L34

How does BatchNorm2d work when calculating the similarity score? It really confuses me.

Thanks