Hello, I'm confused about the crop size. When I run the segmentation demo, I find that BEiT processes images at (512, 512), but in ViT-Adapter the crop size is usually set to (896, 896). Why was this size selected, and is there any relationship between 512 and 896? Looking forward to your response, thanks!
Crop size 896 was first adopted in the SwinV2 paper. We adopted the same setting in some of our models to obtain higher mIoU.
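For a rough sense of how the two sizes relate, here is a minimal sketch that counts the patch tokens a square crop produces. It assumes a ViT-style backbone with a patch size of 16, which is not stated above; `tokens_per_side` and `token_count` are hypothetical helper names used only for illustration.

```python
# Assumption: ViT-style backbone with 16x16 patches (not stated in the thread).
PATCH_SIZE = 16


def tokens_per_side(crop_size: int, patch_size: int = PATCH_SIZE) -> int:
    """Number of patch tokens along one side of a square crop."""
    if crop_size % patch_size != 0:
        raise ValueError(f"{crop_size} is not a multiple of {patch_size}")
    return crop_size // patch_size


def token_count(crop_size: int, patch_size: int = PATCH_SIZE) -> int:
    """Total number of patch tokens for a square crop."""
    return tokens_per_side(crop_size, patch_size) ** 2


for crop in (512, 896):
    print(crop, tokens_per_side(crop), token_count(crop))
```

Under that assumption both sizes divide evenly into patches, and the larger crop simply gives the model more tokens (and more context) per image, at a higher compute and memory cost.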