Closed sh0416 closed 1 year ago
I am wondering whether this codebase implements pipeline parallelism.
https://huggingface.co/docs/transformers/perf_train_gpu_many#dppp
I think JAX doesn't need manual pipeline parallelism.
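For context, one possible reading of this comment is that in JAX you annotate how arrays are laid out across devices and let `jax.jit` and the compiler insert the communication, rather than hand-writing stages and schedules. The sketch below is my own illustration, not code from this repository. Note that it shows sharding a weight's hidden dimension (tensor-style parallelism), which is not the same thing as pipeline parallelism. The `mlp` function, the `"model"` axis name, and the array shapes are all hypothetical.

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 1-D device mesh over whatever devices exist (just 1 on a plain CPU).
devices = np.array(jax.devices())
n = len(devices)
mesh = Mesh(devices, axis_names=("model",))  # "model" is an arbitrary axis name

def mlp(x, w1, w2):
    # Hypothetical two-layer MLP: no explicit communication or stage scheduling.
    return jnp.tanh(x @ w1) @ w2

x = jnp.ones((8, 16))
w1 = jnp.ones((16, 4 * n)) * 0.01
w2 = jnp.ones((4 * n, 16)) * 0.01

# Split the hidden dimension of both weights across the "model" mesh axis.
w1 = jax.device_put(w1, NamedSharding(mesh, P(None, "model")))
w2 = jax.device_put(w2, NamedSharding(mesh, P("model", None)))

# jit compiles one program for the whole mesh; the compiler decides where
# collectives are needed based on the input shardings.
out = jax.jit(mlp)(x, w1, w2)
print(out.shape, out.sharding)
```

On a single device this runs as an ordinary jitted function; with several devices the same code runs with the weights split across them. Whether this covers what a manual pipeline schedule would give you depends on the model and the hardware.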