AssertionError: You can't use same `Accelerator()` instance with multiple models when using DeepSpeed

KimMeen / Time-LLM

[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"

https://arxiv.org/abs/2310.01728

Apache License 2.0

1.02k stars, 179 forks

AssertionError: You can't use same `Accelerator()` instance with multiple models when using DeepSpeed #87

Closed: well0203 closed this issue 2 weeks ago

well0203 commented 1 month ago

Hi, thank you again for your work. I have another question: how did you train models over multiple iterations, given the error "AssertionError: You can't use same `Accelerator()` instance with multiple models when using DeepSpeed"? I set `--itr 2` and got this error. How did you solve it? Did you simply run the main script several times with different seeds and average the results, or did you change the main script so that it supports multiple models/iterations? I ask because I could not find a stable fix, since DeepSpeed currently supports only one model. Thank you in advance.

kwuking commented 1 month ago

Hi, we obtain the final results by running multiple epochs. Additionally, we will try to repeat this experiment multiple times to obtain several sets of results and calculate the average value.

well0203 commented 1 month ago

> Hi, we obtain the final results by running multiple epochs. Additionally, we will try to repeat this experiment multiple times to obtain several sets of results and calculate the average value.

Sorry, I did not understand your answer. As I understand from your paper, you run each model multiple times and then average the results. The run_main.py script has an `--itr` parameter for repeating the same experiment (the same model with the same parameters) more than once. But when I set `--itr` to a value greater than 1, I get this error, because Accelerate does not support multiple models with DeepSpeed. I hope this clarifies my previous question.
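
For reference, the "run the script one by one and average" option from the first comment could be sketched roughly as below. This is only a sketch under assumptions, not the authors' method: each repetition is started as its own process with `--itr 1`, so every process prepares a single model with its own `Accelerator()`. The `--seed` flag and the `read_metrics` callback are hypothetical (the thread does not say how run_main.py sets seeds or where it writes results), so adapt both to your setup.

```python
import statistics
import subprocess
from typing import Callable, Dict, List, Sequence


def build_commands(base_cmd: Sequence[str], n_runs: int,
                   first_seed: int = 0) -> List[List[str]]:
    """Return one command per repetition, each forced to a single iteration.

    `--seed` is a hypothetical flag used for illustration only.
    """
    return [[*base_cmd, "--itr", "1", "--seed", str(first_seed + i)]
            for i in range(n_runs)]


def average_metrics(per_run: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """Average each metric (e.g. mse, mae) over all repetitions."""
    keys = per_run[0].keys()
    return {k: statistics.mean(r[k] for r in per_run) for k in keys}


def run_repeats(base_cmd: Sequence[str], n_runs: int,
                read_metrics: Callable[[List[str]], Dict[str, float]],
                launch: Callable = subprocess.run) -> Dict[str, float]:
    """Launch every repetition in a separate process, then average the results.

    `read_metrics` is a user-supplied (hypothetical) function that returns the
    metrics of the run that was just started with the given command.
    """
    results = []
    for cmd in build_commands(base_cmd, n_runs):
        launch(cmd, check=True)  # a fresh process per repetition
        results.append(read_metrics(cmd))
    return average_metrics(results)
```

A call might look like `run_repeats(["accelerate", "launch", "run_main.py"], 3, read_metrics=my_reader)`, where `my_reader` is your own function and the remaining run_main.py arguments are appended to the base command. The trade-off is that the model is reloaded for every repetition, which costs start-up time but avoids reusing one `Accelerator()` across models.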