Toolkit for running MXNet training scripts on SageMaker. Dockerfiles used for building SageMaker MXNet Containers are at https://github.com/aws/deep-learning-containers.
Description of changes:
Add a class that users can use to set up a parameter server, as before, for distributed training.
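For reviewers unfamiliar with what "setting up a parameter server" involves, here is a rough sketch of the environment each MXNet parameter-server process typically needs. It is not the class added in this PR: the function name `parameter_server_env`, the default port, and the choice of the first sorted host as scheduler are all assumptions made for illustration. Only the `DMLC_*` variable names come from MXNet's distributed training setup.

```python
# Hypothetical helper, not the class from this PR: builds the DMLC_*
# environment variables that MXNet's parameter-server processes read.
VALID_ROLES = ("scheduler", "server", "worker")


def parameter_server_env(hosts, role, port=8000):
    """Return the DMLC_* settings for one process in the cluster.

    hosts: hostnames of all machines in the training job.
    role:  one of "scheduler", "server", or "worker".
    port:  scheduler port (8000 is an arbitrary, assumed default).
    """
    if role not in VALID_ROLES:
        raise ValueError("role must be one of %s" % (VALID_ROLES,))
    if not hosts:
        raise ValueError("at least one host is required")

    hosts = sorted(hosts)
    return {
        "DMLC_ROLE": role,
        # Assumption: the first host (sorted) runs the scheduler.
        "DMLC_PS_ROOT_URI": hosts[0],
        "DMLC_PS_ROOT_PORT": str(port),
        # Assumption: one server and one worker per host.
        "DMLC_NUM_SERVER": str(len(hosts)),
        "DMLC_NUM_WORKER": str(len(hosts)),
    }


env = parameter_server_env(["algo-2", "algo-1"], "server")
```

A caller would merge this dictionary into `os.environ` before starting the scheduler, server, or worker process on each host.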
A few notes:
This doesn't work if you have only one host but still want the parameter server. It is likely an issue with a process ending before it should, but I'm not sure yet. I will investigate further.
Naming is hard. Feel free to suggest alternatives :)
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.