Is the multi-GPU training implemented in this project async SGD or sync SGD? Does TensorFlow automatically protect against stale gradients? I'm asking for one of my own projects. I have implemented sync SGD following the multi-GPU CIFAR example, and another version using a technique similar to the one in this repo. The version in this repo appears to be far more adjustable and ends up being faster, since the gradient setup is left to TensorFlow. Do you know whether this is async or sync SGD? Please let me know if you do.
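For context, here is the distinction I'm drawing, as a toy sketch in plain Python (this is not this repo's code, just an illustration on a quadratic loss; all names are my own assumptions). Sync SGD averages tower gradients computed from the same weights and applies one update; async SGD lets each worker apply its own update, so later updates may use stale gradients:

```python
def grad(w):
    # Gradient of the toy loss L(w) = 0.5 * w**2 is simply w.
    return w

def sync_sgd_step(w, num_towers, lr):
    # Sync SGD: every tower computes a gradient from the SAME weights,
    # the gradients are averaged, and a single update is applied.
    grads = [grad(w) for _ in range(num_towers)]
    return w - lr * sum(grads) / len(grads)

def async_sgd_steps(w, num_workers, lr):
    # Async SGD (worst case): each worker reads the initial weights,
    # computes a gradient, then applies its update independently --
    # so all but the first update use stale gradients.
    stale_grads = [grad(w) for _ in range(num_workers)]
    for g in stale_grads:
        w = w - lr * g
    return w

print(sync_sgd_step(4.0, num_towers=4, lr=0.1))
print(async_sgd_steps(4.0, num_workers=4, lr=0.1))
```

With sync SGD the four towers produce one averaged step; with fully stale async SGD the same four gradients produce four steps in the old direction, which is exactly the staleness effect I'm asking whether TensorFlow guards against.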