Closed: LTnanana closed this issue 3 years ago.
Hi, we still follow the ImageNet setting of sw=[1,2,7,7]. Enlarging it will lead to better performance, but the GPU memory cost and FLOPs will also increase.
Hi @LTnanana, did sw=[1,2,7,7] work for you, or did you change the split size? Hope to get a reply!
Thanks
Hi, I tried enlarging the split size, but the performance drops a little. Do you have any results demonstrating the improvement, or could you provide trained models with a larger split size? Besides, I find that the last stage always uses global self-attention no matter what the split size is. Looking forward to your reply!
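A minimal sketch of why the last stage ends up global (function names are illustrative, not from the repo): assuming a stride-4 stem and 2x downsampling per stage, a 224x224 input yields per-stage feature-map sides of 56, 28, 14, 7. With sw=[1,2,7,7], the last stage's split size equals its entire 7x7 feature map, so the stripe covers everything and the attention is effectively global:

```python
def stage_sides(input_size=224, stem_stride=4, num_stages=4):
    """Per-stage feature-map side lengths, assuming a stride-4 stem
    followed by 2x downsampling between stages."""
    sides = []
    side = input_size // stem_stride
    for _ in range(num_stages):
        sides.append(side)
        side //= 2
    return sides

sides = stage_sides(224)                          # [56, 28, 14, 7]
sw = [1, 2, 7, 7]
# A stage is "global" when the split size spans the whole feature map.
is_global = [s == w for s, w in zip(sides, sw)]   # [False, False, False, True]
```

Under this assumption, any split size equal to the last stage's side length would behave identically there, which matches the observation above.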
My question: is it possible to use split size 7 for the 512x512 input? I tested that scenario and got an error, `Exception: shape '[1, 192, 1, 32, 4, 7]' is invalid for input of size 196608`, because 32 is not divisible by 7. When I replaced 7 with 8, everything worked.
Hi, I wonder what split size you use when training on ADE20K with input size 512x512. Thanks!