Typically, we try to remove positional embeddings so that the model can work with different resolutions, but I've heard a lot of people suggesting relative positional biases like ALiBi: https://github.com/ofirpress/attention_with_linear_biases
It might be worth a try if we get time
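For context, ALiBi drops positional embeddings entirely and instead adds a distance-dependent linear bias to the attention logits, which is what lets it generalize to lengths/resolutions not seen during training. Here's a minimal sketch of the idea (my own illustration, not the repo's implementation; I'm showing the symmetric variant, assuming non-causal attention):

```python
import torch

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    """Build an ALiBi-style bias to add to attention logits.

    Returns a (num_heads, seq_len, seq_len) tensor where entry
    [h, i, j] = -slope_h * |i - j|, so attention decays linearly
    with distance and no positional embedding is needed.
    """
    # Head-specific slopes: geometric sequence from the ALiBi paper,
    # 2^(-8/num_heads), 2^(-16/num_heads), ... (exact when num_heads
    # is a power of two).
    slopes = torch.tensor(
        [2 ** (-8 * (h + 1) / num_heads) for h in range(num_heads)]
    )
    pos = torch.arange(seq_len)
    # Relative distance |i - j| between query position i and key position j.
    dist = (pos[None, :] - pos[:, None]).abs()
    return -slopes[:, None, None] * dist[None, :, :]

# Usage: add the bias to the raw attention scores before softmax, e.g.
#   scores: (batch, num_heads, seq_len, seq_len)
#   scores = scores + alibi_bias(num_heads, seq_len)
```

Since the bias only depends on relative distances, it can be rebuilt on the fly for whatever sequence length (or flattened resolution) shows up at inference time.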