Open rahulagrawal048 opened 2 years ago
I have also seen a similar warning when using B0, B1, and B5 on both Cityscapes and ADE20K. In my case, with B0 on Cityscapes I get a single-scale mIoU of 75.3 instead of the 76.2 stated in the paper. Per-batch training time also seems to be slowed down significantly by this warning. I would be grateful for any suggestions as to what may be causing this.
With B0 on Cityscapes, what batch size per GPU did you use to get 75.3? Did you change anything else?
@rahulagrawal048: I used a total batch size of 8 (4 GPUs, 2 per GPU). It's also important to mention that I do not use mmseg; rather, I carefully ported the MiT and Segformer implementations from here into my own codebase. I also closely followed the config files in local_configs/ for B0. I thought the warning you mentioned was somehow related to my not using mmseg, but it may be a more general issue than that. Do you use mmseg in your reproduction?
Yes, I am using mmseg throughout, and that might be the reason for the low mIoU I get.
Warning: Grad strides do not match bucket view strides.
Were you able to figure out what led to this warning?
I trained the segmenter-B0 model on the Cityscapes dataset using 4 GPUs with 2 samples per GPU and got the above warning. Has anyone faced a similar problem, or does anyone know where the issue might be?
This caused a drop in performance to mIoU ~ 63 on the val set, compared to the 76.2 stated in the paper.
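For anyone unsure what the warning is comparing: as far as I understand, PyTorch's DistributedDataParallel emits it when a gradient's memory layout (its strides) differs from the layout of the bucket view it reduces into, and the full message notes that this is not an error but may impair performance. The sketch below is plain Python, not PyTorch code, and only illustrates how a permuted view ends up with strides that differ from a dense row-major layout. The helper names (`contiguous_strides`, `permuted_view`, `strides_match`) are hypothetical, and the check is simplified. It is not a diagnosis of the mIoU gap reported in this thread.

```python
def contiguous_strides(shape):
    """Row-major (C-order) strides, in elements, for a dense tensor of `shape`."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def permuted_view(shape, strides, order):
    """Shape and strides of a view that reorders dimensions without copying data."""
    new_shape = tuple(shape[i] for i in order)
    new_strides = tuple(strides[i] for i in order)
    return new_shape, new_strides


def strides_match(shape, strides):
    """Simplified check: does this layout equal the dense row-major layout?"""
    return tuple(strides) == contiguous_strides(shape)


# A dense (2, 3, 4) tensor has the expected row-major strides.
shape = (2, 3, 4)
strides = contiguous_strides(shape)

# Swapping the last two dimensions (as a transpose/permute would) keeps the
# same memory but changes the strides, so they no longer match a dense layout.
view_shape, view_strides = permuted_view(shape, strides, (0, 2, 1))

print(strides, strides_match(shape, strides))
print(view_shape, view_strides, strides_match(view_shape, view_strides))
```

In PyTorch itself, a commonly suggested thing to try is calling `.contiguous()` on tensors right after `permute` or `transpose` in the model's forward pass, so that gradients come back in a dense layout. Whether that removes the warning, or has any effect on accuracy, in this particular setup is an assumption that would need to be verified.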