Status: Open. venkycreator opened this issue 1 month ago.
@venkycreator This example doesn't offer any data parallelism; it could be added by wrapping the model with PyTorch DistributedDataParallel.
Tensor parallelism is possible either by using DeepSpeed or by adding the argument `--parallel_strategy tp`.
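To illustrate the suggestion above, here is a minimal sketch of wrapping a model with DistributedDataParallel. It uses a stand-in `torch.nn.Linear` model and runs as a single-process group on the "gloo" (CPU) backend purely for demonstration; a real run would launch multiple processes with `torchrun` and a GPU backend.

```python
# Minimal sketch: wrapping a model in DistributedDataParallel (DDP).
# Single-process demo on the "gloo" backend; real training would use
# torchrun with one process per GPU and the "nccl" backend.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun normally sets these env vars; set them here for a one-process demo.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)

    model = torch.nn.Linear(16, 4)  # stand-in for the actual model
    ddp_model = DDP(model)          # gradients are all-reduced across ranks

    out = ddp_model(torch.randn(2, 16))
    out.sum().backward()            # backward() triggers gradient sync

    dist.destroy_process_group()
    return out.shape

if __name__ == "__main__":
    print(main())
```

With `torchrun --nproc_per_node=N script.py`, each process would hold a replica of the model and see a different shard of each batch, which is what makes this data parallelism rather than tensor parallelism.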
System Info
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
I wanted to know how to run the models in parallel using DeepSpeed.
Expected behavior
I wanted to know what kind of parallelism it supports.