Closed: AlimTleuliyev closed this issue 1 year ago
I have the same problem. Have you solved it? Thank you.
@sunmooncm No, I could not find the solution :(
1) Do you observe any error in single-GPU mode? 2) I suggest wrapping the code in a main() block so there is a single entry point.
I want to point out two things:
1) Running DDP from Jupyter is not supported at the moment. There are workarounds (https://www.kaggle.com/code/onodera/ddp-example), but we don't provide support for this method.
2) You should be using main() to set up DDP correctly; see the sketch after this list, and check our entry-point script for train_from_recipe: https://github.com/Deci-AI/super-gradients/blob/f82a3b462b0ba8b05499c8aafb9f9500370e1fb9/src/super_gradients/train_from_recipe.py
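For illustration, here is a minimal sketch of what such a single entry point can look like with SuperGradients. The setup_device import path, the multi_gpu/num_gpus arguments, and the Trainer parameters are assumptions based on the library's documentation and may differ between versions, so treat it as a template rather than the official recipe script linked above.

```python
# Minimal sketch (not from this thread) of a single-entry-point DDP script.
# The setup_device import path and the exact argument names are assumptions
# based on the SuperGradients docs and may vary with the installed version.
from super_gradients.training import Trainer
from super_gradients.training.utils.distributed_training_utils import setup_device


def main():
    # Request DDP across 2 GPUs (assumed count) before building the Trainer,
    # so all distributed setup happens inside this single entry point.
    setup_device(multi_gpu="DDP", num_gpus=2)

    trainer = Trainer(experiment_name="ddp_example", ckpt_root_dir="./checkpoints")
    # Build your model, dataloaders, and training_params here, then:
    # trainer.train(model=model, training_params=training_params,
    #               train_loader=train_loader, valid_loader=valid_loader)


if __name__ == "__main__":
    # The __main__ guard gives the script a single entry point, which is
    # required when DDP launches or re-imports worker processes.
    main()
```

Run it as a regular script (e.g. `python train_ddp.py`) rather than from a notebook cell, per point 1 above.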
💡 Your Question
I am trying to run multi-GPU training using the Distributed Data Parallel (DDP) strategy. It gives this error: