In the README, you mentioned that you would optimize the training code using DeepSpeed and Accelerate. However, as far as I know, the DeepSpeed functionality integrated into the Accelerate library does not support multi-model training. Do you have any suggestions about use deepspeed to optimize the memory?
In the README, you mentioned that you would optimize the training code using DeepSpeed and Accelerate. However, as far as I know, the DeepSpeed functionality integrated into the Accelerate library does not support multi-model training. Do you have any suggestions about use deepspeed to optimize the memory?