As mentioned in the paper, Swin Transformer is a new vision transformer (ViT), which serves as a general-purpose backbone for computer vision.
"Interesting" points:
The model architecture is as follows:
I have not read the details of the paper.
Presented at ICCV '21. [ Paper | Supplement | arXiv | Code ] Awarded Best Paper (Marr Prize)!
Authors: Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo
Affiliations: Microsoft Research Asia, University of Science and Technology of China, Xi'an Jiaotong University, Tsinghua University