germain-hug / Deep-RL-Keras

Keras Implementation of popular Deep RL Algorithms (A3C, DDQN, DDPG, Dueling DDQN)

A3C issues #9

jarlva closed this issue 5 years ago

jarlva commented 5 years ago

Hi, I noticed that A3C is having 2 issues:

  1. CTRL-C won't stop the script. Had to kill my CMD process.
  2. At the end, while rendering the test results it spits out:

(36env) c:\Users\user\py\Deep-RL-Keras>main.py --type A3C --env CartPole-v0 --nb_episodes 10000 --n_threads 2
Using TensorFlow backend.
Score: 40.0: : 5050 episodes [00:04, 1259.48 episodes/s]
Traceback (most recent call last):
  File "C:\Users\user\py\Deep-RL-Keras\main.py", line 114, in <module>
    main()
  File "C:\Users\user\py\Deep-RL-Keras\main.py", line 107, in main
    a = algo.policy_action(old_state)
  File "C:\Users\user\py\Deep-RL-Keras\A3C\a3c.py", line 61, in policy_action
    return np.random.choice(np.arange(self.act_dim), 1, p=self.actor.predict(s).ravel())[0]
  File "C:\Users\user\py\Deep-RL-Keras\A3C\agent.py", line 22, in predict
    return self.model.predict(self.reshape(inp))
  File "C:\Users\user\py\36env\lib\site-packages\keras\engine\training.py", line 1149, in predict
    x, _, _ = self._standardize_user_data(x)
  File "C:\Users\user\py\36env\lib\site-packages\keras\engine\training.py", line 751, in _standardize_user_data
    exception_prefix='input')
  File "C:\Users\user\py\36env\lib\site-packages\keras\engine\training_utils.py", line 128, in standardize_input_data
    'with shape ' + str(data_shape))
ValueError: Error when checking input: expected input_1 to have 3 dimensions, but got array with shape (4, 4)
Exception ignored in: <bound method Viewer.__del__ of <gym.envs.classic_control.rendering.Viewer object at 0x0000022E46D94BA8>>
Traceback (most recent call last):
  File "c:\users\user\py\gym\gym\envs\classic_control\rendering.py", line 143, in __del__
  File "c:\users\user\py\gym\gym\envs\classic_control\rendering.py", line 62, in close
  File "C:\Users\user\py\36env\lib\site-packages\pyglet\window\win32\__init__.py", line 305, in close
  File "C:\Users\user\py\36env\lib\site-packages\pyglet\window\__init__.py", line 770, in close
ImportError: sys.meta_path is None, Python is likely shutting down

germain-hug commented 5 years ago

Hi,

  1. CTRL-C won't stop the script. Had to kill my CMD process.

It looks like this stems from the threads not being run in daemon mode; I will look into it.
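For reference, a minimal sketch of what running threads in daemon mode looks like with the standard `threading` module. The names `train_worker` and `start_workers` are hypothetical stand-ins and are not taken from the repository. The interpreter does not wait for daemon threads at shutdown, so the process can exit once the main thread ends, even if a worker is still looping.

```python
import threading
import time


def train_worker():
    # Hypothetical stand-in for one A3C training thread: it never returns.
    while True:
        time.sleep(0.01)


def start_workers(n_threads):
    # daemon=True tells the interpreter not to wait for these threads on exit.
    threads = [
        threading.Thread(target=train_worker, daemon=True)
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    return threads


threads = start_workers(2)
print([t.daemon for t in threads])
# The script ends here; the still-running daemon workers do not block the exit.
```

Daemon threads are stopped abruptly at shutdown, so any cleanup they hold (open files, render windows) is skipped. That trade-off is usually acceptable for training workers.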

  2. At the end, while rendering the test results it spits out:

I believe the last part of the error comes from the environment not being closed properly. I added env.env.close() at the end of main; let me know if that solves the issue.
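One way to make sure the environment is closed even when evaluation raises is a `try`/`finally` block. The sketch below uses a hypothetical `DummyEnv` class in place of a real Gym environment, and the `evaluate` function is illustrative only; it is not the code in main.py.

```python
class DummyEnv:
    """Hypothetical stand-in for a Gym environment that owns a render window."""

    def __init__(self):
        self.closed = False

    def reset(self):
        return 0

    def step(self, action):
        # Simulate a failure in the middle of an evaluation episode.
        raise ValueError("simulated failure during evaluation")

    def close(self):
        self.closed = True


def evaluate(env):
    try:
        env.reset()
        env.step(0)
    finally:
        # Runs on success and on error, so the window is released
        # before the interpreter starts shutting down.
        env.close()


env = DummyEnv()
try:
    evaluate(env)
except ValueError:
    pass
print(env.closed)
```

Closing explicitly avoids relying on `__del__` at interpreter shutdown, which is where the `sys.meta_path is None` message in the report was raised.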

jarlva commented 5 years ago

Hi Germain, I noticed another thing about A3C: it runs past the number of episodes given on the command line. The example below sets 10000 episodes. After 10000 episodes the progress bar disappears, but the script keeps running until it finishes by itself at 495673 episodes. It happens with both the old and the new script.

(36env) c:\Users\Jake\py\Deep-RL-Keras>main.py --type A3C --env MyCartPole-v0 --batch_size 64 --nb_episodes 10000 --consecutive_frames 4
Using TensorFlow backend.
Score: 0.869877360644808: : 495673 episodes [12:45, 647.90 episodes/s]

The error in the initial issue (top) is gone. It seems the close() did the job. Ctrl-C still does not work reliably. Thanks, Jake

jarlva commented 5 years ago

Hi, I tried the latest Ctrl-C change. The script now exits by itself a few seconds after it starts.

germain-hug commented 5 years ago

I just updated the code; Ctrl-C should work fine now. I also fixed a bug in how the episode counter is incremented, so it should no longer go past the requested number of episodes.
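To illustrate how a shared episode counter can stay within its limit when several threads increment it, here is a minimal sketch. It assumes the limit is shared by all workers; the class `EpisodeCounter` and the function `run_workers` are hypothetical and do not reflect the actual fix in the repository. The check and the increment happen under one lock, so two threads cannot both claim the last episode.

```python
import threading


class EpisodeCounter:
    """Hypothetical shared counter that never exceeds `limit`."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0
        self._lock = threading.Lock()

    def try_claim(self):
        # Check and increment atomically; return False once the limit is hit.
        with self._lock:
            if self.count >= self.limit:
                return False
            self.count += 1
            return True


def run_workers(limit, n_threads):
    counter = EpisodeCounter(limit)
    per_thread = [0] * n_threads  # each worker writes only its own slot

    def worker(idx):
        while counter.try_claim():
            per_thread[idx] += 1  # placeholder for running one episode

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.count, sum(per_thread)


total, ran = run_workers(limit=1000, n_threads=4)
print(total, ran)
```

Checking `count < limit` outside the lock and incrementing afterwards would reintroduce the overshoot, because several threads could pass the check before any of them increments.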

jarlva commented 5 years ago

Hi Germain, I tried Ctrl-C after doing a git pull, but it still does not stop the script. I am running on Windows 10 with Python 3.6.6 and the latest tensorflow/keras.

germain-hug commented 5 years ago

Hi, I'm running on OS X with Python 3.6.5 and it stops instantly for me... What error does it output?

jarlva commented 5 years ago

No output error since it does not respond to ctrl-c and the only way to get out is to wait for it to finish (no error) or kill the process. It seems like a Windows-specific error.

On another subject, I'm getting good results from A2C (thank you!!). I don't see a place in the code to test and view it playing after training (saved weights or otherwise). Any chance you can add it?

germain-hug commented 5 years ago

No output error since it does not respond to ctrl-c and the only way to get out is to wait for it to finish (no error) or kill the process. It seems like a Windows-specific error.

Perhaps you can check whether there is a way to fix it on your side; it should be related to the KeyboardInterrupt handling at line 128 of a3c.py.
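A possible direction, sketched under assumptions: as far as I know, a blocking `Thread.join()` on Windows cannot be interrupted by Ctrl-C until the thread finishes, which would match the behavior described. Polling with `join(timeout=...)` returns control to the main thread regularly, so a `KeyboardInterrupt` can be raised there and passed on to the workers through a `threading.Event`. The names `train_worker` and `run` are hypothetical, and this is not the code at line 128 of a3c.py.

```python
import threading
import time


def train_worker(stop_event, results, idx, max_steps=50):
    # Hypothetical stand-in for a training thread that checks a stop flag.
    steps = 0
    while not stop_event.is_set() and steps < max_steps:
        time.sleep(0.001)  # placeholder for one training step
        steps += 1
    results[idx] = steps


def run(n_threads=2):
    stop_event = threading.Event()
    results = [None] * n_threads
    threads = [
        # daemon=True is only a safety net in case a worker ignores the flag.
        threading.Thread(target=train_worker, args=(stop_event, results, i), daemon=True)
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    try:
        # Short timeouts keep the main thread responsive to Ctrl-C.
        while any(t.is_alive() for t in threads):
            for t in threads:
                t.join(timeout=0.1)
    except KeyboardInterrupt:
        stop_event.set()  # ask the workers to finish their current step
        for t in threads:
            t.join(timeout=1.0)
    return results


results = run(2)
print(results)
```

The workers only stop if they check `stop_event` regularly, so the check belongs inside the innermost training loop.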

On another subject, I'm getting good results from A2C (thank you!!). I don't see a place in the code to test and view it playing after training (saved weights or otherwise). Any chance you can add it?

You're welcome! I've added a separate file for visualisation (load_and_run.py); details about the arguments to pass can be found in the README.

jarlva commented 5 years ago

I tried some options with KeyboardInterrupt, but I'm not a Python expert.

I tried load_and_run.py and it worked perfectly!