Accelerate allows training and inference of large models by automagically splitting the layers across CUDA devices. Initially, we had some issues with logprobs because we patched the model's forward method. Now that we generally use transition scores instead, logprobs may work with Accelerate.
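As a rough illustration of what a transition score is, the sketch below computes the log-probability of each generated token from the logits at its generation step. It is plain Python rather than our actual code path: the function names `log_softmax` and `transition_scores` and the toy logits are made up for this example. In Hugging Face transformers, `compute_transition_scores` with `normalize_logits=True` should return comparable values from the scores that `generate` already outputs, which is presumably why this route does not depend on patching forward.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one step's logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def transition_scores(step_logits, token_ids):
    """Log-probability of the token chosen at each generation step.

    step_logits: one list of vocabulary logits per generated token.
    token_ids:   the token id actually generated at each step.
    """
    return [
        log_softmax(logits)[tok]
        for logits, tok in zip(step_logits, token_ids)
    ]


# Toy data (hypothetical): a 3-token vocabulary and two generation steps.
step_logits = [[2.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
token_ids = [0, 2]

scores = transition_scores(step_logits, token_ids)
# Summing the per-token scores gives the log-probability of the sequence.
sequence_logprob = sum(scores)
```

Because the scores depend only on per-step logits and the sampled token ids, it should not matter which device each layer lives on, though that still needs to be confirmed against a model actually loaded through Accelerate.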