ResNet + Pyramid Vision Transformer Version 2

The-AI-Summer / self-attention-cv

Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.

https://theaisummer.com/

MIT License

1.18k stars, 154 forks

ResNet + Pyramid Vision Transformer Version 2 #8

Closed: khawar-islam closed this issue 3 years ago

khawar-islam commented 3 years ago

Thank you for your work and the clear explanations. As you know, ViT doesn't work well on small datasets, so I am implementing ResNet34 combined with Pyramid Vision Transformer Version 2 (PVT v2) to improve on it. The architectures of ViT and PVT v2 are completely different. Could you please help me implement it?

black0017 commented 3 years ago

Hello, have you checked the authors' repo, https://github.com/whai362/PVT?

Is the model you are referring to (ResNet34 with Pyramid Vision Transformer Version 2) described or implemented in the paper?

I am also looking into the pyramid vision transformer concept. @khawar512

khawar-islam commented 3 years ago

No, they did not use the ResNet18 model with PVT, but I want to. Do you have any ideas? We could work together and discuss PVT concepts.

black0017 commented 3 years ago

Do you have any intuition as to why this would help you solve your task? I would like to add some version of PVT, but I am currently short on time.
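
For anyone picking this idea up, here is a rough, dependency-free sketch of the shape bookkeeping for one possible hybrid. It is not taken from the PVT paper, the authors' repo, or this repository. It assumes the front of ResNet34 (conv1, maxpool, layer1) stands in for the first pyramid stage, followed by three PVT-style stages that each halve the resolution and use spatial-reduction attention. The function names, the stage widths `(128, 320, 512)` and the reduction ratios `(4, 2, 1)` are illustrative assumptions.

```python
# Hypothetical shape bookkeeping for a ResNet34-front + PVT-style hybrid.
# All function names and the stage layout are illustrative assumptions.

def conv_out(size, kernel, stride, padding):
    """Output length along one spatial axis for a conv or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet34_front(h, w):
    """Spatial size and channels after ResNet34 conv1, maxpool and layer1.

    conv1 is 7x7 / stride 2 / pad 3, maxpool is 3x3 / stride 2 / pad 1, and
    layer1 keeps the resolution with 64 channels, so the overall stride is 4.
    """
    h, w = conv_out(h, 7, 2, 3), conv_out(w, 7, 2, 3)
    h, w = conv_out(h, 3, 2, 1), conv_out(w, 3, 2, 1)
    return h, w, 64


def pvt_style_stages(h, w, dims=(128, 320, 512), sr_ratios=(4, 2, 1)):
    """Token counts per stage, with and without spatial reduction."""
    stages = []
    for dim, sr in zip(dims, sr_ratios):
        # Assumed overlapping patch embedding: 3x3 conv, stride 2, pad 1.
        h, w = conv_out(h, 3, 2, 1), conv_out(w, 3, 2, 1)
        tokens = h * w
        # Spatial reduction: keys/values come from a map shrunk by sr per axis.
        kv_tokens = (h // sr) * (w // sr)
        stages.append({
            "size": (h, w),
            "dim": dim,
            "tokens": tokens,
            "kv_tokens": kv_tokens,
            "full_attn_pairs": tokens * tokens,        # plain self-attention
            "reduced_attn_pairs": tokens * kv_tokens,  # spatial-reduction attention
        })
    return stages


if __name__ == "__main__":
    h, w, c = resnet34_front(224, 224)
    print("after ResNet34 front:", (h, w, c))
    for i, stage in enumerate(pvt_style_stages(h, w), start=2):
        print("stage", i, stage)
```

The point of the sketch is the intuition: the convolutional front end cuts a 224x224 image down to a 56x56 map before any attention is applied, and spatial reduction keeps the number of query-key pairs small in the higher-resolution stages. Whether this actually helps on a small dataset would have to be checked experimentally.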