Hello,

I can't seem to reproduce the ablation study results in Figures 3 and 4 of the ICCV paper. When training and evaluating with an iteration number of 3 (T_train = T_eval = 3), my final mIoU is 76.04%, which is 2.48% lower than the result shown in Figure 4 (78.52%).
I used the default settings in settings.py except the following:
N_LAYERS = 50 (experiment done on Resnet-50)
STRIDE = 16 for training and 8 for evaluation, as stated in Sec. 6.1
BATCH_SIZE = 12
DEVICE = 0
DEVICES = list(range(0, 1))
NUM_WORKERS = 12
Furthermore, my Pillow version is 6.1.0 and my cv2 (OpenCV) version is 3.4.2, which differ from the versions used by the authors.
Is it possible that training EMANet on a single GPU causes such a significant drop in mIoU (possibly due to the use of synchronized batch norm?), or could using different versions of Pillow / cv2 be the root cause of this problem?
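For intuition on why the single-GPU setup could interact with batch norm, here is a small sketch (plain NumPy, not the EMANet code, with made-up activation values) comparing the full-batch statistics that a single GPU or synchronized BN would see against the per-replica statistics of a hypothetical unsynchronized 4-GPU split of the same batch of 12:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical activations for one batch of 12 samples, 4 channels.
batch = rng.normal(loc=2.0, scale=3.0, size=(12, 4))

# Single GPU (or synchronized BN across GPUs): statistics over all 12 samples.
full_mean = batch.mean(axis=0)

# 4 GPUs without synchronized BN: each replica would normalize with the
# statistics of its own 3-sample slice of the batch.
per_gpu_means = [batch[i::4].mean(axis=0) for i in range(4)]

# The per-replica means scatter around the full-batch mean, so the
# normalized activations (and hence the training dynamics) differ.
spread = max(abs(m - full_mean).max() for m in per_gpu_means)
print(spread > 0.0)
```

So a single GPU with BATCH_SIZE = 12 actually sees *larger* normalization batches than an unsynchronized multi-GPU run would, which is why I'd expect sync BN to make the two setups behave similarly rather than worse, but I may be missing something.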
Thanks in advance :)