kakaobrain / torchgpipe

A GPipe implementation in PyTorch
https://torchgpipe.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

[Question] Inference time speed up or not? #31

Closed tuanmanh1410 closed 3 years ago

tuanmanh1410 commented 3 years ago

Thanks for sharing this project and paper. I'm using GPipe in PyTorch to measure inference time on the same test dataset, comparing against a single GPU as the baseline.

1/ Inference with GPipe seems slower than on a single GPU. Does that mean GPipe is suitable for training large models but not effective for speeding up inference? Please correct me if I'm wrong.

2/ I'm also curious: does the GPipe library measure the communication latency among GPUs when intermediate data is transmitted between two GPUs in a row?

Thank you
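
For a comparison like the one described in 1/, host-side timers only give meaningful numbers if every GPU involved is synchronized around the timed region, since work may still be in flight on later pipeline stages when the forward call returns. A minimal timing sketch (the `bench` helper, model, and iteration count are illustrative, not part of torchgpipe):

```python
import time

import torch


def bench(model, x, iters=100, devices=None):
    """Average forward-pass time in seconds, syncing every GPU involved."""
    devices = devices if devices is not None else [x.device]
    with torch.no_grad():
        model(x)  # warm-up pass (CUDA init, caching allocator, etc.)
        for d in devices:
            torch.cuda.synchronize(d)
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        for d in devices:  # wait until all pipeline stages finish
            torch.cuda.synchronize(d)
    return (time.perf_counter() - start) / iters
```

For the single-GPU baseline, `bench(model, x)` with the plain model suffices; for the GPipe-wrapped model, passing `devices=model.devices` makes the timer wait for every partition.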

sublee commented 3 years ago

The purpose of GPipe is to train a model that is too large to fit in the memory of a single GPU, by partitioning it across several GPUs and pipelining micro-batches through the partitions. It is not designed to speed up inference: the micro-batch scheduling and the copies between partitions add overhead, so a model that already fits on one GPU will usually run inference faster there.

If your model isn't large enough, you don't need GPipe.
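
As a concrete illustration of that intended use, here is a minimal sketch of wrapping a model with torchgpipe (the toy model, layer sizes, and balance are made up; GPipe takes an nn.Sequential and slices it into partitions):

```python
import torch
import torch.nn as nn
from torchgpipe import GPipe

# GPipe requires an nn.Sequential so that it can cut the layer
# list into partitions (a hypothetical toy model here).
model = nn.Sequential(
    nn.Linear(1024, 4096), nn.ReLU(),
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1024),
)

# First 3 modules on one GPU, last 2 on another; each mini-batch
# is split into 8 micro-batches that flow through the pipeline.
model = GPipe(model, balance=[3, 2], chunks=8)

x = torch.randn(64, 1024, device=model.devices[0])  # input on the first partition
y = model(x)                                        # output on the last partition
```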

For more details, see the torchgpipe documentation (https://torchgpipe.readthedocs.io/) and the GPipe paper.

Also, GPipe itself does not provide latency measurement. We used NVIDIA Nsight Systems to profile and optimize its communication cost.
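
If a rough in-code number is enough instead of a full profiler trace, one approach (a sketch, not something torchgpipe provides) is to time a single device-to-device copy of an activation-sized tensor directly in PyTorch:

```python
import time

import torch

src, dst = torch.device('cuda:0'), torch.device('cuda:1')
x = torch.randn(8, 1024, 1024, device=src)  # ~32 MB, a stand-in for activations

x.to(dst)  # warm-up copy: triggers peer-access setup the first time
torch.cuda.synchronize(src)
torch.cuda.synchronize(dst)

t0 = time.perf_counter()
y = x.to(dst)
torch.cuda.synchronize(dst)  # wait until the copy has landed
ms = (time.perf_counter() - t0) * 1000
print(f'device-to-device copy: {ms:.3f} ms')
```

For the real per-partition copy costs inside a training or inference run, a profiler trace (e.g. `nsys profile python your_script.py`) remains the more reliable tool.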