shinianzhihou closed this 3 years ago
If you're running with a config, you can set the following key in the task configuration:
{
...
"distributed": {"batch_norm_sync_mode": <sync_mode>} # <sync_mode> can be "pytorch" or "apex"
}
If you use the "apex" mode, you can also sync over a smaller group size. For intra-node sync batch norm, the setting should look like this:
{
...
"distributed": {"batch_norm_sync_mode": "apex", "batch_norm_sync_group_size": 8}
}
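If you build the task config in Python rather than editing the JSON by hand, the same keys can be set programmatically. This is only a rough sketch: `with_sync_bn` is a hypothetical helper and not part of any library, the `some_other_key` entry is a placeholder, and the check that the group size goes with "apex" only reflects my reading of the note above.

```python
import copy
import json

# The two sync modes named in this thread.
VALID_SYNC_MODES = {"pytorch", "apex"}


def with_sync_bn(config, sync_mode, group_size=None):
    """Return a copy of `config` with the sync BN keys set (hypothetical helper)."""
    if sync_mode not in VALID_SYNC_MODES:
        raise ValueError(f"sync_mode must be one of {sorted(VALID_SYNC_MODES)}")
    # Assumption: the group size setting is only meaningful in "apex" mode.
    if group_size is not None and sync_mode != "apex":
        raise ValueError("batch_norm_sync_group_size requires the 'apex' mode")

    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    distributed = updated.setdefault("distributed", {})
    distributed["batch_norm_sync_mode"] = sync_mode
    if group_size is not None:
        distributed["batch_norm_sync_group_size"] = group_size
    return updated


# Placeholder config; a real task config would carry its other keys here.
task_config = {"some_other_key": "unchanged"}
updated = with_sync_bn(task_config, "apex", group_size=8)
print(json.dumps(updated, indent=2))
```

The helper returns a copy, so the original config stays as it was and you can keep a variant without sync BN for single-GPU runs.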
Hope this helps!
It helps a lot! thx!
How do I enable synchronized BN during training?
thx!!!