Closed JunzheJosephZhu closed 2 years ago
As in the title: can you replace the standard ViT encoder with a Swin Transformer + FPN? Would this be a reasonable thing to try out?
Yes, you could probably replace the ViT with a Swin encoder :) It might improve performance, since Swin does better than ViT on some downstream tasks.
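To make the structural difference concrete, here is a rough sketch in plain Python (no deep learning framework) of the feature-map shapes each encoder would hand to the rest of the model. It is not code from this repository. The helper names (`vit_feature_shape`, `swin_feature_shapes`, `fpn_output_shapes`) are made up for illustration, the defaults assume the common ViT-B/16 and Swin-T configurations at 224x224 input, and the FPN output width of 256 is just a typical choice.

```python
# Illustrative shape arithmetic only; all function names here are hypothetical.

def vit_feature_shape(image_size=224, patch_size=16, embed_dim=768):
    """A plain ViT keeps one resolution: a single (channels, H, W) map."""
    side = image_size // patch_size
    return [(embed_dim, side, side)]


def swin_feature_shapes(image_size=224, patch_size=4, embed_dim=96, num_stages=4):
    """Swin is hierarchical: each later stage halves H and W and doubles channels."""
    shapes = []
    side = image_size // patch_size
    channels = embed_dim
    for _ in range(num_stages):
        shapes.append((channels, side, side))
        side //= 2       # patch merging halves the spatial resolution
        channels *= 2    # and doubles the channel width
    return shapes


def fpn_output_shapes(feature_shapes, out_channels=256):
    """An FPN projects every level to the same channel width, keeping each resolution."""
    return [(out_channels, h, w) for _, h, w in feature_shapes]


if __name__ == "__main__":
    print("ViT-B/16:", vit_feature_shape())
    print("Swin-T stages:", swin_feature_shapes())
    print("After FPN:", fpn_output_shapes(swin_feature_shapes()))
```

The practical consequence is that a ViT gives the downstream module one feature map, while Swin + FPN gives it several at different strides, so whatever consumes the encoder output would need to be adapted to pick one level or fuse them.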