Is there a particular reason you removed the LayerNorm from the queries and the keys inside the Attention block? This is the original implementation in timm: https://github.com/huggingface/pytorch-image-models/blob/main/timm/models/vision_transformer.py#L83. Thanks a lot!

No particular reason. Those layernorms are not used in any of the models we support (i.e., they are always set to nn.Identity()) and weren't present in the version of timm we were developing with (0.4.12). Feel free to add them if there's a model you want to use that has those layernorms.
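For anyone landing here, below is a minimal sketch of the optional q/k LayerNorm pattern, following the `qk_norm` convention of the linked timm implementation. The module name and defaults are illustrative, not the exact timm code; the point is that with `qk_norm=False` both norms collapse to `nn.Identity()`, which is the no-op behavior described above.

```python
import torch
import torch.nn as nn


class Attention(nn.Module):
    """Minimal multi-head self-attention with optional q/k LayerNorm.

    When qk_norm=False, q_norm and k_norm are nn.Identity() (no-ops),
    matching the behavior of the models discussed above.
    """

    def __init__(self, dim, num_heads=8, qkv_bias=False, qk_norm=False,
                 norm_layer=nn.LayerNorm):
        super().__init__()
        assert dim % num_heads == 0, "dim must be divisible by num_heads"
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5

        self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias)
        # Per-head LayerNorm on queries/keys, or a no-op when disabled.
        self.q_norm = norm_layer(self.head_dim) if qk_norm else nn.Identity()
        self.k_norm = norm_layer(self.head_dim) if qk_norm else nn.Identity()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        B, N, C = x.shape
        # (B, N, 3C) -> (3, B, num_heads, N, head_dim)
        qkv = (self.qkv(x)
               .reshape(B, N, 3, self.num_heads, self.head_dim)
               .permute(2, 0, 3, 1, 4))
        q, k, v = qkv.unbind(0)
        # Normalize queries and keys before computing attention logits.
        q, k = self.q_norm(q), self.k_norm(k)

        attn = (q @ k.transpose(-2, -1)) * self.scale
        attn = attn.softmax(dim=-1)
        x = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(x)


# Usage sketch: enable the q/k norms for a ViT-Base-sized block.
attn = Attention(dim=768, num_heads=12, qk_norm=True)
out = attn(torch.randn(1, 197, 768))  # -> (1, 197, 768)
```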