Closed yaozengwei closed 1 year ago
Hi @yaozengwei ,
Yes, you are right. The softmax should be applied over the second dimension (`dim=1`); this is an implementation issue. You can replace `A = A.softmax(dim=-1)` with `A = A.softmax(dim=1)` or `A = torch.nn.functional.normalize(A, dim=1)`.
I will update the model weights and code soon with this change.
Best regards, Abdelrahman.
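
For readers comparing the two replacements suggested above, note that they are not equivalent: `softmax(dim=1)` produces non-negative weights that sum to 1 over the spatial positions, whereas `torch.nn.functional.normalize(A, dim=1)` divides by the L2 norm (its default `p=2`) and keeps the sign of each entry. The snippet below is only a minimal sketch on a random tensor; the sizes `B` and `N` are hypothetical and nothing here is taken from the repository code.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes: B = batch size, N = number of spatial positions (H*W).
B, N = 2, 5
A = torch.randn(B, N, 1)  # stand-in for the (B, H*W, 1) tensor discussed in the thread

# Option 1: softmax over the spatial dimension.
soft = A.softmax(dim=1)

# Option 2: L2 normalization over the spatial dimension (p=2 is the default).
unit = F.normalize(A, dim=1)

# Softmax weights are non-negative and sum to 1 along dim 1.
soft_sums = soft.sum(dim=1)                # shape (B, 1)

# Normalized values have unit L2 norm along dim 1 and may be negative.
unit_norms = unit.pow(2).sum(dim=1).sqrt() # shape (B, 1)

print("softmax sums:", soft_sums.flatten().tolist())
print("L2 norms after normalize:", unit_norms.flatten().tolist())
print("any negative after normalize:", bool((unit < 0).any()))
```

Which of the two matches the paper depends on how the attention weights are defined there, so the choice should be checked against the paper's formulation.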
Thanks for your quick response!
I suspect there is a mismatch between the code and the paper. At line 175, the shape of `A` is (B, H*W, 1). I think it should be `A = A.softmax(dim=1)`, so that the softmax is applied over the spatial dimension (i.e., H*W). Please correct me if I'm mistaken.
https://github.com/Amshaker/SwiftFormer/blob/075daf69f8959052dfaf7a1e537009304a17f9ce/models/swiftformer.py#L172-L177
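
The effect of the dimension choice can be checked on a small tensor. The following is a minimal sketch, assuming only the shape (B, H*W, 1) stated above; the sizes `B`, `H`, `W` and the random tensor are hypothetical stand-ins, not values from the repository. A softmax over a dimension of size 1 always returns 1, so `dim=-1` turns every entry into 1 regardless of the input, while `dim=1` yields a distribution over the H*W positions.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes chosen only for illustration.
B, H, W = 2, 3, 4
A = torch.randn(B, H * W, 1)  # same layout as the tensor in the thread: (B, H*W, 1)

# Softmax over the last dimension, which has size 1:
# each element is normalized against itself, so every output is 1.
over_last = A.softmax(dim=-1)

# Softmax over the spatial dimension (H*W):
# the weights differ per position and sum to 1 for each batch element.
over_spatial = A.softmax(dim=1)

print("dim=-1 all ones:", bool(torch.allclose(over_last, torch.ones_like(A))))
print("dim=1 sums per batch element:", over_spatial.sum(dim=1).flatten().tolist())
```

With `dim=-1` the input values have no influence on the result, which is consistent with the mismatch described above.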