champson opened this issue 1 year ago
I'm also very interested in this issue. It would be great to get a clear response from the community on this matter. Thanks!
@champson, @yefanhust, are there specific models or scenarios you are looking to apply pipeline parallelism to? The scenarios where PP is helpful for inference are very narrow, and it is applicable in just a handful of cases currently, so we have de-prioritized releasing these features. But we can revisit this if there is strong interest in these features from the community.
Hi! I am also interested in the feature described in this paper. Is there a demo or tutorial for the 'hybrid pipeline inference schedule'?
As mentioned in the paper https://arxiv.org/abs/2207.00032, DeepSpeed inference supports pipeline parallelism, including hybrid scheduling, activation offloading, and communication optimizations, which lead to significant performance improvements. However, does DeepSpeed currently support these features? If not, is there a timeline for when they will be supported?
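For anyone landing here: as noted above, the hybrid pipeline *inference* schedule from the paper does not appear to be released. The closest publicly documented path I'm aware of is DeepSpeed's existing pipeline-parallel engine (`deepspeed.pipe.PipelineModule`), which targets training-style pipelined execution. Below is a minimal sketch of that existing API, assuming a recent DeepSpeed release and a toy stack of `nn.Linear` layers; the layer sizes, stage count, and config values are illustrative only and not taken from the paper.

```python
# Minimal sketch of DeepSpeed's existing pipeline-parallel API (training engine).
# Assumes launching with the DeepSpeed launcher, e.g.:
#   deepspeed --num_gpus=2 pipeline_sketch.py
# The model, sizes, and config values below are placeholders for illustration.
import torch.nn as nn
import deepspeed
from deepspeed.pipe import PipelineModule

# Distributed backend must be up before building the pipeline topology.
deepspeed.init_distributed()

# A plain list of layers; PipelineModule partitions them across pipeline stages.
layers = [nn.Linear(1024, 1024) for _ in range(8)]

model = PipelineModule(
    layers=layers,
    num_stages=2,                # split the 8 layers over 2 pipeline stages
    partition_method="uniform",  # equal number of layers per stage
)

ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 4,  # number of in-flight micro-batches
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
}

engine, _, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=[p for p in model.parameters() if p.requires_grad],
    config=ds_config,
)

# engine.train_batch(data_iter) / engine.eval_batch(data_iter) then run the
# pipelined schedule. The hybrid inference schedule, activation offloading,
# and communication optimizations described in the paper are not exposed here.
```

This is only a sketch of what is currently available; an official answer on the inference-side features would still be welcome.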