chainer / chainerrl

ChainerRL is a deep reinforcement learning library built on top of Chainer.
MIT License

Evaluating a recurrent model before async training results in crash #114

Closed. muupan closed this issue 6 years ago.

muupan commented 7 years ago

If you modify examples/gym/train_a3c_gym.py as in the commit below, so that a recurrent model is evaluated before async training starts, the program crashes silently after creating its subprocesses. This happens even if you call reset_state.

https://github.com/muupan/chainerrl/commit/c14d5fd88734fa8f1fd0de1d380b1cec5a586f72

$ python examples/gym/train_a3c_gym.py 4 --arch LSTMGaussian --env Pendulum-v0
[2017-06-16 16:48:18,237] Making new env: Pendulum-v0
[2017-06-16 16:48:18,622] Making new env: Pendulum-v0
[2017-06-16 16:48:18,622] Making new env: Pendulum-v0
[2017-06-16 16:48:18,625] Making new env: Pendulum-v0
[2017-06-16 16:48:18,626] Making new env: Pendulum-v0
[2017-06-16 16:48:18,647] Making new env: Pendulum-v0
[2017-06-16 16:48:18,649] Making new env: Pendulum-v0
[2017-06-16 16:48:18,650] Making new env: Pendulum-v0
[2017-06-16 16:48:18,665] Making new env: Pendulum-v0
# The program silently crashes!

Evaluating a model before training makes sense if you use placeholders to define the model, so this behavior should be fixed.

muupan commented 7 years ago

This happened on OS X 10.11.6 with Python 3.5.1 but did not happen on Ubuntu 14.04 with Python 3.5.2.

muupan commented 7 years ago

On the same OS X machine, 3.6.1 installed via pyenv crashed, but anaconda3-4.4.0 did not. Both use Python 3.6.1.

muupan commented 6 years ago

This seems to be due to Accelerate not working correctly with multiprocessing: https://github.com/ContinuumIO/anaconda-issues/issues/133
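
For anyone who wants to check their own setup without ChainerRL, the suspected pattern can be sketched with NumPy alone: use BLAS once in the parent process, then fork children that use it again. This is only an illustrative sketch, not the code from the commit above. The function names, matrix size, and timeout are hypothetical, and it assumes a Unix system where the "fork" start method is available.

```python
import multiprocessing as mp

import numpy as np


def child_matmul(size):
    # Each child performs its own BLAS-backed matrix product.
    a = np.ones((size, size))
    np.dot(a, a)


def run_after_parent_blas(n_children=2, size=256, timeout=30.0):
    """Use BLAS in the parent, then fork children that use it again.

    Returns the list of child exit codes. 0 means a clean exit; a
    negative value means the child was killed by a signal.
    """
    # Touch BLAS in the parent first, standing in for the evaluation
    # that runs before the training subprocesses are created.
    a = np.ones((size, size))
    np.dot(a, a)

    # Explicitly request "fork" (Unix only); this is an assumption of
    # the sketch, not something the training script is known to do.
    ctx = mp.get_context("fork")
    procs = [
        ctx.Process(target=child_matmul, args=(size,))
        for _ in range(n_children)
    ]
    for p in procs:
        p.start()

    exit_codes = []
    for p in procs:
        p.join(timeout)
        if p.is_alive():
            # A child that hangs is terminated so the script can finish.
            p.terminate()
            p.join()
        exit_codes.append(p.exitcode)
    return exit_codes


if __name__ == "__main__":
    print(run_after_parent_blas())
```

If every exit code is 0, this pattern does not fail on that machine. Nonzero or negative codes would be consistent with the behavior described here, though they do not prove that Accelerate is the cause.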

Chainer discourages using Accelerate, so use a different backend for NumPy.
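
To see whether a given NumPy build is linked against Accelerate, one rough option is to inspect what numpy.show_config() prints. The sketch below is a heuristic under stated assumptions: the helper names are hypothetical, and it assumes that an Accelerate-linked build mentions "accelerate" or "veclib" somewhere in that output, which may not hold for every NumPy version.

```python
import contextlib
import io

import numpy as np


def numpy_config_text():
    """Return the text that numpy.show_config() prints to stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        np.show_config()
    return buf.getvalue()


def mentions_accelerate(config_text):
    # Heuristic (assumption): Accelerate-linked builds mention
    # "accelerate" or "veclib" in their build configuration.
    lowered = config_text.lower()
    return "accelerate" in lowered or "veclib" in lowered


if __name__ == "__main__":
    if mentions_accelerate(numpy_config_text()):
        print("NumPy build config mentions Accelerate/vecLib")
    else:
        print("No mention of Accelerate/vecLib in the NumPy build config")
```

A match only suggests that the build uses Accelerate. It would also fit the earlier observation that the pyenv and Anaconda installs on the same machine behaved differently, since the two can ship NumPy builds linked against different BLAS libraries.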