hsahovic / poke-env

A python interface for training Reinforcement Learning bots to battle on pokemon showdown
https://poke-env.readthedocs.io/
MIT License

Trouble getting `examples/rl_with_open_ai_gym_wrapper.py` to work properly #214

Open arcaputo3 opened 3 years ago

arcaputo3 commented 3 years ago

Running the example with Python 3.6.13 in a conda environment fails with the following error:

Exception in thread Thread-6:
Traceback (most recent call last):
  File "C:\Users\rcapu\.conda\envs\poke_env_2\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "C:\Users\rcapu\.conda\envs\poke_env_2\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\rcapu\.conda\envs\poke_env_2\lib\site-packages\poke_env\player\env_player.py", line 361, in <lambda>
    target=lambda: env_algorithm_wrapper(self, env_algorithm_kwargs)
  File "C:\Users\rcapu\.conda\envs\poke_env_2\lib\site-packages\poke_env\player\env_player.py", line 345, in env_algorithm_wrapper
    env_algorithm(player, **kwargs)
  File "<ipython-input-2-f04533e2348b>", line 83, in dqn_training
    dqn.fit(player, nb_steps=nb_steps)
  File "C:\Users\rcapu\.conda\envs\poke_env_2\lib\site-packages\rl\core.py", line 169, in fit
    action = self.forward(observation)
  File "C:\Users\rcapu\.conda\envs\poke_env_2\lib\site-packages\rl\agents\dqn.py", line 227, in forward
    q_values = self.compute_q_values(state)
  File "C:\Users\rcapu\.conda\envs\poke_env_2\lib\site-packages\rl\agents\dqn.py", line 69, in compute_q_values
    q_values = self.compute_batch_q_values([state]).flatten()
  File "C:\Users\rcapu\.conda\envs\poke_env_2\lib\site-packages\rl\agents\dqn.py", line 64, in compute_batch_q_values
    q_values = self.model.predict_on_batch(batch)
  File "C:\Users\rcapu\.conda\envs\poke_env_2\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1036, in predict_on_batch
    self._make_predict_function()
  File "C:\Users\rcapu\.conda\envs\poke_env_2\lib\site-packages\tensorflow\python\keras\engine\training.py", line 2027, in _make_predict_function
    **kwargs)
  File "C:\Users\rcapu\.conda\envs\poke_env_2\lib\site-packages\tensorflow\python\keras\backend.py", line 3544, in function
    return EagerExecutionFunction(inputs, outputs, updates=updates, name=name)
  File "C:\Users\rcapu\.conda\envs\poke_env_2\lib\site-packages\tensorflow\python\keras\backend.py", line 3429, in __init__
    raise ValueError('Unknown graph. Aborting.')
ValueError: Unknown graph. Aborting.

I am guessing this is a dependency issue. Below are the contents of a requirements.txt listing all installed packages:

absl-py==0.14.0
astor==0.8.1
astunparse==1.6.3
backcall==0.2.0
cached-property==1.5.2
cachetools==4.2.2
certifi==2021.5.30
charset-normalizer==2.0.6
clang==5.0
cloudpickle==2.0.0
colorama==0.4.4
dataclasses==0.8
decorator==5.1.0
entrypoints==0.3
flatbuffers==1.12
gast==0.2.2
google-auth==1.35.0
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
grpcio==1.41.0
gym==0.20.0
h5py==3.1.0
idna==3.2
importlib-metadata==4.8.1
ipykernel==5.5.5
ipython==7.16.1
ipython-genutils==0.2.0
jedi==0.18.0
jupyter-client==7.0.5
jupyter-core==4.8.1
keras==2.6.0
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.2
keras-rl2==1.0.3
Markdown==3.3.4
nest-asyncio==1.5.1
numpy==1.16.4
oauthlib==3.1.1
opt-einsum==3.3.0
orjson==3.6.1
parso==0.8.2
pickleshare==0.7.5
poke-env==0.4.19
prompt-toolkit==3.0.20
protobuf==3.18.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
Pygments==2.10.0
python-dateutil==2.8.2
pywin32==301
pyzmq==22.3.0
requests==2.26.0
requests-oauthlib==1.3.0
rsa==4.7.2
six==1.15.0
tabulate==0.8.9
tb-nightly==1.14.0a20190603
tensorboard==2.0.2
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.0
tensorflow==2.0.0b1
tensorflow-estimator==2.0.1
termcolor==1.1.0
tf-estimator-nightly==1.14.0.dev2019060501
tornado==6.1
traitlets==4.3.3
typing-extensions==3.7.4.3
urllib3==1.26.7
wcwidth==0.2.5
websockets==9.1
Werkzeug==2.0.1
wincertstore==0.2
wrapt==1.12.1
zipp==3.6.0

arcaputo3 commented 3 years ago

I should note that this ONLY occurs in a Jupyter setting. When executed with Python directly there is no issue.

hsahovic commented 3 years ago

Hey @arcaputo3,

I think that this might be a weird interaction caused by Jupyter. I'll take a look at it.

hsahovic commented 3 years ago

I tried to get the example running in Jupyter, without success. Jupyter creates an event loop, which interferes with the way `play_against` is currently implemented. Future updates that include large changes to `play_against` and `EnvPlayer` in general should fix this.
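
To check whether a given environment already has an event loop running, a minimal sketch along these lines may help. The helper name `event_loop_is_running` is hypothetical and not part of poke-env. It relies on `asyncio.get_running_loop()`, which needs Python 3.7 or later; on Python 3.6, as in the reported environment, `asyncio.get_event_loop().is_running()` is the rough equivalent.

```python
import asyncio


def event_loop_is_running() -> bool:
    """Return True if an asyncio event loop is already running in this thread.

    Hypothetical helper for illustration only; it is not part of poke-env.
    """
    try:
        # Raises RuntimeError when no loop is running in the current thread.
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True
```

Called from a plain Python script this should return False, while in a recent Jupyter kernel it would typically return True, which is the situation described above.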

I would recommend not using Jupyter to run your training scripts.
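
If the notebook is still the preferred place to start a run, one possible workaround is to launch the script in a separate interpreter process, so that it runs with Python directly as in the working case reported above. This is only a sketch and has not been verified against poke-env: the helper name `run_script_in_fresh_interpreter` is hypothetical, and it assumes that a fresh process is enough to avoid the notebook's event loop.

```python
import subprocess
import sys


def run_script_in_fresh_interpreter(script_path, *args):
    """Run a Python script in a new interpreter process.

    Hypothetical helper for illustration only; it is not part of poke-env.
    Returns the subprocess.CompletedProcess with captured stdout and stderr.
    """
    # sys.executable points at the interpreter of the current environment,
    # so the child process sees the same installed packages.
    command = [sys.executable, script_path, *args]
    return subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (works on Python 3.6)
    )
```

From a notebook cell this could be called as `result = run_script_in_fresh_interpreter("examples/rl_with_open_ai_gym_wrapper.py")`, followed by `print(result.stdout)`. The output is only available once the script has finished, so long training runs will appear silent until then.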