abergeron closed this 8 years ago
I addressed the comments and ran a basic test of the ASGD rule with the LSTM examples. It's still running, but it has done a couple of updates successfully. I just don't know yet if it's going in the right direction.
I tried to do worker.close() and got an error:
Traceback (most recent call last):
  File "/home/rizar/Dist/blocks-examples/mnist/__init__.py", line 139, in <module>
    args.learning_rate, args.sync_freq, args.alpha)
  File "/home/rizar/Dist/blocks-examples/mnist/__init__.py", line 118, in main
    worker.close()
  File "/home/rizar/Dist/platoon/platoon/channel.py", line 461, in close
    self._shmref.unlink()
posix_ipc.ExistentialError: No shared memory exists with the specified name
I've made a fix for that.
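The fix itself is not shown in this thread. As a rough illustration only, one common way to make this kind of cleanup tolerant is to treat "the segment is already gone" as a non-error when unlinking. The sketch below uses the standard library's multiprocessing.shared_memory instead of posix_ipc so that it is self-contained; safe_unlink is a hypothetical helper name, not part of platoon. With posix_ipc, the analogous exception to catch would be posix_ipc.ExistentialError, as seen in the traceback above. It assumes a POSIX system and Python 3.8 or later.

```python
from multiprocessing import shared_memory


def safe_unlink(shm):
    """Unlink a shared memory segment, tolerating one that is already gone.

    Hypothetical helper for illustration. Returns True if this call removed
    the segment's name, False if the name no longer existed.
    """
    try:
        shm.unlink()
    except FileNotFoundError:
        # Another process (or an earlier cleanup) already removed the name.
        return False
    return True


# Create a small segment, then simulate two cleanups of the same name.
segment = shared_memory.SharedMemory(create=True, size=16)
first = safe_unlink(segment)   # removes the name
second = safe_unlink(segment)  # name is already gone; no exception escapes
segment.close()                # release this process's mapping
```

Whether a missing segment should be ignored silently or logged depends on which process is expected to own the cleanup; that design choice is not settled by anything in this thread.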
The example seems to work with the new ASGD rule. I did not check if training performance is any better.
This also has an implementation of ASGD in the last commit.