How to improve validation performance with single-GPU training

Hello, I have been training on a single A6000 with the following command. The dataset is NYU Depth V2:

python -m torch.distributed.launch --nproc_per_node=1 train.py -d 0

The final evaluation results are as follows:

1   wall            81.021%
2   floor           87.021%
3   cabinet         61.932%
4   bed             69.513%
5   chair           63.245%
6   sofa            60.218%
7   table           49.219%
8   door            41.947%
9   window          51.984%
10  bookshelf       47.516%
11  picture         63.291%
12  counter         65.530%
13  blinds          64.546%
14  desk            22.162%
15  shelves         17.098%
16  curtain         66.864%
17  dresser         50.984%
18  pillow          43.180%
19  mirror          47.236%
20  floor mat       41.197%
21  clothes         23.869%
22  ceiling         71.380%
23  books           34.329%
24  refridgerator   61.644%
25  television      56.592%
26  paper           34.894%
27  towel           41.295%
28  shower curtain  40.596%
29  box             12.221%
30  whiteboard      66.973%
31  person          81.384%
32  night stand     48.616%
33  toilet          78.085%
34  sink            58.604%
35  lamp            50.531%
36  bathtub         52.759%
37  bag             15.531%
38  otherstructure  30.959%
39  otherfurniture  19.369%
40  otherprop       40.737%
----------
mean_IoU 50.402%    freq_IoU 63.141%    mean_pixel_acc 62.905%    pixel_acc 76.368%

I read in the issues that achieving 54% validation performance with mit-b2 requires four 2080 GPUs.

My question is: how much can distributed training improve the final validation performance? And if I only have a single GPU, what should I change, or which hyperparameters should I use, to get results close to those reported in the paper? Thank you!

huaaaliu / RGBX_Semantic_Segmentation
