Closed zaccharieramzi closed 4 years ago
Introducing model sequentialism to have different blocks of the model placed on different GPUs to benefit from more memory.
A little communication overhead is expected so computation should be slower.
Introducing model sequentialism to have different blocks of the model placed on different GPUs to benefit from more memory.
A little communication overhead is expected so computation should be slower.