Closed: avantikalal closed this issue 4 years ago
I'm not sure I follow this completely. I'm running the example/run.sh script with epochs=25 and `--distributed` on ToT dev-v0.2.0, and it runs just fine.

Were you running this on NGC? Does the example command run fine for you?
Fixed.
I tried training a model with the `--distributed` flag on 8 GPUs, using the following command:

and got the following output:

When `--distributed` was replaced with `--gpu 0`, the same command worked fine and trained on 1 of the 8 GPUs.
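For context, here is a minimal sketch of the kind of setup a `--distributed` flag typically enables in PyTorch-based tools: one process per GPU joins a `torch.distributed` process group, and gradients are synchronized with `all_reduce`. This is an illustration only, not this project's actual launch code — the backend name and world size below are assumptions, and it uses the CPU `gloo` backend with a single rank so it runs anywhere; a real 8-GPU job would use `backend="nccl"` with `world_size=8` and one process per device.

```python
import os
import torch
import torch.distributed as dist

def all_reduce_demo():
    # Rendezvous settings every rank must agree on; with the default
    # env:// init method these come from environment variables.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29501")

    # Join the process group. world_size=1 keeps this runnable without
    # GPUs; a multi-GPU launcher would start world_size processes and
    # give each a distinct rank.
    dist.init_process_group(backend="gloo", rank=0, world_size=1)

    # all_reduce sums a tensor across all ranks in place -- this is the
    # core collective behind distributed gradient averaging. With a
    # single rank the value is unchanged.
    t = torch.ones(1)
    dist.all_reduce(t)

    dist.destroy_process_group()
    return t.item()

if __name__ == "__main__":
    print(all_reduce_demo())  # 1.0 with world_size=1
```

If `--distributed` fails while `--gpu 0` succeeds, the failure usually lies in this process-group setup (rendezvous address/port, backend, or per-rank device assignment) rather than in the training loop itself.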