Closed: shahzad-ali closed this issue 1 year ago
Hi @shahzad-ali, directly replacing LayerNorm with BatchNorm2d is incompatible because the dimension arrangements differ, so you would need to permute the input tensors accordingly. A simpler way is to use BatchNorm1d: then you only need to transpose the last two dimensions of the input before each norm and transpose back right afterward.
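The transpose-wrap-transpose idea can be sketched as follows (the wrapper class name is mine, not from the FocalNet repo; I assume the norm layers receive channels-last `(B, N, C)` tensors, the layout `nn.LayerNorm` normalizes over directly):

```python
import torch
import torch.nn as nn

class BatchNormChannelsLast(nn.Module):
    """BatchNorm1d adapted to channels-last (B, N, C) inputs."""

    def __init__(self, num_features: int):
        super().__init__()
        # BatchNorm1d expects the channel dimension second: (B, C, N).
        self.bn = nn.BatchNorm1d(num_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (B, N, C) -> (B, C, N) for BatchNorm1d, then back to (B, N, C).
        return self.bn(x.transpose(1, 2)).transpose(1, 2)

# Example with shapes matching a typical first stage (hypothetical values):
x = torch.randn(2, 56 * 56, 96)          # batch, tokens, embed_dim
y = BatchNormChannelsLast(96)(x)
print(y.shape)                            # torch.Size([2, 3136, 96])
```

Such a wrapper could then be passed as `norm_layer=BatchNormChannelsLast`, keeping the rest of the model untouched.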
One more comment: it has been demonstrated that our FocalNets with LayerNorm work well for semantic segmentation on ADE20K, so you can try the provided checkpoints directly, without any changes.
Excellent work!
By default, layer normalization is used as `FocalNet(norm_layer=nn.LayerNorm)`. I'm wondering if it's a better choice for a semantic segmentation task. I would love to hear some thoughts on this.

Strangely enough, setting `norm_layer=nn.BatchNorm2d` caused several errors, since `x` in `nn.BatchNorm2d(embed_dim)(x)` was found to be a 3D tensor with `embed_dim` as its last dimension. If `nn.LayerNorm` is supposed to be the default normalization for FocalNet, then why do we even have it as an input parameter?

Looking forward to getting awesome replies. Thanks!
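For the record, the shape mismatch described above can be reproduced in isolation (the `embed_dim` and batch/token sizes here are illustrative, not taken from the repo): `nn.BatchNorm2d` requires a channels-first 4D input `(B, C, H, W)`, while the norm layer here receives a channels-last 3D tensor `(B, N, C)`.

```python
import torch
import torch.nn as nn

embed_dim = 96
x = torch.randn(2, 56 * 56, embed_dim)  # (B, N, C), channels-last 3D tensor

try:
    nn.BatchNorm2d(embed_dim)(x)
except ValueError as e:
    # BatchNorm2d's input check rejects anything that is not 4D.
    print(e)
```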