uber-research / UPSNet

UPSNet: A Unified Panoptic Segmentation Network

Training is slow. #8

Closed dxbdxx closed 5 years ago

dxbdxx commented 5 years ago

Hello, I am trying to reproduce the results without Horovod. I am using 4 Tesla K80 GPUs (12 GB each) and training the network with "upsnet_resnet50_coco_4gpu.yaml", but it looks like training may take more than 10 days. Do you have any advice for speeding up training? Thanks.

YuwenXiong commented 5 years ago

A K80 is ~2x slower than a 1080 Ti (5.6 TFLOPS vs 11.3 TFLOPS), and we suggest using 8-16 GPUs for COCO experiments. I think 10 days on COCO with 4 K80 GPUs is a normal training time. Please try a smaller dataset such as Cityscapes.
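As a rough illustration of the arithmetic behind the "~2x" figure, the sketch below takes the two peak-throughput numbers quoted above and scales the 10-day estimate by their ratio. The function names are hypothetical, and the linear scaling is an assumption: peak TFLOPS ignores memory bandwidth, data loading, and inter-GPU communication, so real training time will usually not follow it exactly.

```python
# Back-of-the-envelope estimate only; assumes training time scales
# inversely with peak TFLOPS, which real workloads rarely do exactly.

K80_TFLOPS = 5.6           # figure quoted in the comment above
GTX_1080_TI_TFLOPS = 11.3  # figure quoted in the comment above


def peak_ratio(fast_tflops: float, slow_tflops: float) -> float:
    """Return how many times faster the first GPU is at peak throughput."""
    return fast_tflops / slow_tflops


def naive_scaled_days(days_on_slow: float, fast_tflops: float, slow_tflops: float) -> float:
    """Scale a training-time estimate by the peak-throughput ratio (idealized)."""
    return days_on_slow * slow_tflops / fast_tflops


ratio = peak_ratio(GTX_1080_TI_TFLOPS, K80_TFLOPS)
ideal_days = naive_scaled_days(10, GTX_1080_TI_TFLOPS, K80_TFLOPS)

print(f"peak ratio: {ratio:.2f}x")                      # peak ratio: 2.02x
print(f"idealized estimate: {ideal_days:.1f} days")     # idealized estimate: 5.0 days
```

Treat the result as an optimistic lower bound rather than a prediction; measured wall-clock time on the target hardware is the only reliable number.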

dxbdxx commented 5 years ago

Thank you. I'll try this.

lxtGH commented 5 years ago

@YuwenXiong Hi, what about using 4 1080 Ti GPUs on COCO? How long does training take? I have trained on Cityscapes, which took me about 2 days.

YuwenXiong commented 5 years ago

With 4 1080 Ti GPUs, Cityscapes should finish in < 1 day, and COCO will take ~7 days.