Jason3900 opened this issue 2 weeks ago
Do you really need to call `prepare` multiple times? You should be able to run `prepare` in a single call, right?
```python
# Gather everything in one dict, prepare it all in a single call,
# then assign each prepared object back to its attribute.
return_values = self.accelerator.prepare(*accelerator_to_prepare.values())
for k, val in zip(accelerator_to_prepare.keys(), return_values):
    setattr(self, k, val)
```
Yeah, it's okay. But I think it would be nicer if you pointed it out in the documentation or fixed the logic internally. Otherwise, it might be confusing, and users might struggle to find the problem.
The DeepSpeed init logic is probably not easy to fix, but I'll wait for Zach's return to comment on that. Regarding the docs, yes, it should probably be highlighted that, to be on the safe side, there should be a single `prepare` call containing everything that's required.
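
To make that recommendation concrete, here is a minimal sketch of a single `prepare` call that still lets the set of prepared objects vary with the training arguments; the `Trainer` class and its argument names are hypothetical, not taken from the original code:

```python
from accelerate import Accelerator

class Trainer:
    # Hypothetical wrapper class; only the prepare logic matters here.
    def __init__(self, model, optimizer, train_dataloader, scheduler=None):
        self.accelerator = Accelerator()
        # Collect everything that needs preparation, including optional
        # objects (like the scheduler) only when they are actually used.
        to_prepare = {
            "model": model,
            "optimizer": optimizer,
            "train_dataloader": train_dataloader,
        }
        if scheduler is not None:
            to_prepare["scheduler"] = scheduler
        # A single prepare() call, so DeepSpeed sees all objects together.
        prepared = self.accelerator.prepare(*to_prepare.values())
        for name, obj in zip(to_prepare.keys(), prepared):
            setattr(self, name, obj)
```

Because the dict is built conditionally before the one `prepare` call, this keeps the flexibility of a variable item set without splitting preparation across calls.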
System Info

Information

Tasks

- An officially supported `no_trainer` script in the `examples` folder of the `transformers` repo (such as `run_no_trainer_glue.py`)

Reproduction
Hey, since I may want to prepare only certain items depending on my training arguments (suppose I don't want to prepare the scheduler this time), I decided to collect them in a dict and call the prepare function multiple times, since the set of items is not fixed. After that, I use setattr to assign them back to their attributes. It works perfectly until I change my code to support the DeepSpeed plugin.
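
To illustrate, here is a rough reconstruction of that pattern; the dummy model, optimizer, and scheduler are placeholders I've assumed, not the original code:

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()  # assume it is configured with a DeepSpeed plugin

model = torch.nn.Linear(8, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

# Items are chosen dynamically, then prepared one at a time (the failing pattern).
to_prepare = {"optimizer": optimizer, "scheduler": scheduler, "model": model}
prepared = {}
for name, obj in to_prepare.items():
    # Each iteration is a separate prepare() call. Under DeepSpeed, only the
    # arguments of the final call are inspected together, so the optimizer and
    # scheduler are never seen alongside the model, and their "auto" config
    # entries cannot be resolved.
    prepared[name] = accelerator.prepare(obj)
```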
In the accelerator's `_prepare_deepspeed` function, it captures the prepared items, finds the corresponding optimizer and scheduler, and then extracts the kwargs passed to them to fill in the DeepSpeed config and make everything work. But in my case, I call the accelerate `prepare` method multiple times, and it only captures the last call, which means the result contains only one item (`[model]` in my case). Thus it cannot find the kwargs needed by the optimizer and scheduler (because they're set to "auto" in the DeepSpeed config), which makes `deepspeed_config_process` fail with an error.

Expected behavior
I think accelerate should handle this scenario.
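
For reference, this is the kind of DeepSpeed config fragment involved in the report above; the exact keys are illustrative, but the "auto" placeholders are what `deepspeed_config_process` needs to resolve from the objects passed to a single `prepare` call:

```python
# Illustrative DeepSpeed config fragment (shown as a Python dict). The "auto"
# values are normally filled in from the optimizer/scheduler kwargs found among
# the arguments of one prepare() call; if those objects were prepared in an
# earlier, separate call, the values cannot be resolved and config processing
# fails.
ds_config = {
    "optimizer": {
        "type": "AdamW",
        "params": {"lr": "auto", "weight_decay": "auto"},
    },
    "scheduler": {
        "type": "WarmupLR",
        "params": {"warmup_num_steps": "auto"},
    },
    "train_micro_batch_size_per_gpu": "auto",
}
```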