wntg opened 6 months ago (status: Open)
Accelerate makes it trivial to scale from a single machine to multiple machines with the same code. The training code is therefore generalisable: the same training loop runs unchanged regardless of the machine it is launched on. See https://huggingface.co/docs/accelerate/basic_tutorials/launch
Users can therefore leverage whatever compute they have available, whether a single GPU or several machines, with no changes to the distillation code.
Accelerate is a tool for multi-machine training, so why do you use it on a single GPU?