SoyGema commented 6 years ago

DQN Implementation Issue description

Context

macOS with PySC2
Agent development: DQN Agent
Agent code here

Problem

Describe the problem you found:

Function _Functions.Hallucination_Whatever is currently not available. The agent performs the Hallucination 3 times and then crashes, so my assumption is that it can only perform it until it has no energy left.
Documentation consulted so far:
- pysc2/lib/features.py, line 535: "If a specific action is available, the general will also be available", so I assume the general actions are available for as long as Hallucination is.
- docs/environment.md, line 388: specifies the arguments for calling the valid actions, so the argument for Hallucination is _NOT_QUEUED.

Issue resolution

Define the actions_to_choose function in such a way that the agent doesn't crash and is able to learn.
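One way to read this goal, as a minimal sketch rather than the repository's actual code: before sending an action, check whether its function id is among the currently available actions, and fall back to a no-op otherwise. The name `choose_safe_action`, the plain-integer ids, and the `(function_id, arguments)` tuple are hypothetical stand-ins chosen for illustration. In PySC2 the available ids are reported in the observation's `available_actions` field, `no_op` has id 0, and 248 is the `Hallucination_Adept_quick` id that appears in the trace.

```python
NO_OP_ID = 0                  # id of no_op in PySC2; it takes no arguments
HALLUCINATION_ADEPT_ID = 248  # id taken from the trace in this issue


def choose_safe_action(wanted_id, wanted_args, available_ids):
    """Return (function_id, args), falling back to no-op if unavailable.

    Hypothetical helper: `available_ids` stands in for the list of function
    ids the environment reports as available at the current step.
    """
    if wanted_id in available_ids:
        return wanted_id, wanted_args
    # The wanted action cannot be executed right now (for example, the unit
    # has run out of energy), so do nothing instead of crashing.
    return NO_OP_ID, []


# Hallucination is available: the wanted action is passed through unchanged.
print(choose_safe_action(HALLUCINATION_ADEPT_ID, [[0]], [0, 1, 248]))
# Hallucination is no longer available: the helper falls back to no-op.
print(choose_safe_action(HALLUCINATION_ADEPT_ID, [[0]], [0, 1]))
```

A no-op fallback keeps the episode running, but the agent then learns from an action it did not choose, so masking unavailable actions before the policy selects one may be a better long-term design.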

Trace error

Error shown on screen

FunctionCall(function=<_Functions.Hallucination_Adept_quick: 248>, arguments=[[0]])
FunctionCall(function=<_Functions.Hallucination_Adept_quick: 248>, arguments=[[0]])
FunctionCall(function=<_Functions.Hallucination_Adept_quick: 248>, arguments=[[0]])
Traceback (most recent call last):
  File "DQN_Agent.py", line 196, in <module>
    training_game()
  File "DQN_Agent.py", line 188, in training_game
    log_interval=1e4, verbose=2)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/rl/core.py", line 168, in fit
    observation, r, done, info = env.step(action)
  File "DQN_Agent.py", line 94, in step
    obs = super(Environment, self).step([action])
  File "/Applications/StarCraft II/pysc2/pysc2/lib/stopwatch.py", line 197, in _stopwatch
    return func(*args, **kwargs)
  File "/Applications/StarCraft II/pysc2/pysc2/env/sc2_env.py", line 449, in step
    self._controllers, self._features, self._obs, actions))
  File "/Applications/StarCraft II/pysc2/pysc2/lib/run_parallel.py", line 54, in run
    funcs = [f if callable(f) else functools.partial(*f) for f in funcs]

davidleejy commented 6 years ago

Hello soygema,

Nice work trying to put pysc2 & keras-rl together.

I got the same error when running DQN_Agent_LSTM.py.

I noticed that actions_to_choose() basically replaces whatever action the agent tries to express with the "hallucinate adept" action, which then fails with the error you mentioned.

As such, I modified DQN_Agent_LSTM.py to make the action replacement random, and the code now runs fine. However, the agent's own action still isn't applied to the environment for the time being.

code here: https://pastebin.com/7dB1Y7B1

The code is meant to replace DQN_Agent_LSTM.py. Running it is the same as running DQN_Agent_LSTM.py.

SoyGema commented 6 years ago

Hey @davidleejy! Thank you very much for the time dedicated to this issue. I will look at it carefully and get back to you.

SoyGema commented 6 years ago

Hi @davidleejy! Thanks for pushing this issue forward. Let me check with you whether I understood your proposal correctly. I've numbered the points and questions to make them easier to answer.

You created the function args_random, which basically helps to generalize over the arguments and to choose between different actions with their corresponding arguments. Am I correct (1)? The abstraction of this function is definitely great!
In the Environment class: you added a new global variable observation_cur, which is used for selecting the army, helps to choose the function_id and, in the end, to choose the action. This observation_cur comes from obs[0].observation; does it refer to the previous step (2)? (I moved some of the changes into the agent call.)

After cleaning up the print statements, switching the call, and changing the warm-up episodes, the training starts correctly. ^^ YEAH!!

First, I would like to thank and congratulate you for helping me solve this challenge!

Here's what I propose you do:

- Send me a PR with the code you already have, and I can propose the cleanup we have been talking about (for you it would mean deleting some code).
- Send me the PR with the code already cleaned (remove the prints in step and all of def actions_to_choose).
- Send the PR with what you have and I will clean the code myself (I don't mind doing it; the point is that you can appear as a contributor to the agent and the repository if you would like).

If you're not in the mood, feel free to tell me ^^

Thanks so much for taking part in the project; it is really significant for me. For Aiur!

davidleejy commented 6 years ago

Hi @SoyGema,

Answers:

(1) args_random: Yes, correct. I have more correct randomization code and will include it in the PR. It turns out that _sc2.obs.actionspec needs to be consulted for the args, to know the screen size and minimap size, so that the spatial coordinates can be randomized in the ranges [0, screen_size) and [0, minimap_size). Sorry if I sound a little vague; I will send you a PR with this more correct randomization included.
(2) observation_cur: Yes, correct.
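The randomization described for args_random can be sketched as follows. This is only an illustration under stated assumptions, not the code from the PR: `random_args` and `example_arg_sizes` are hypothetical names, and the sizes used here (a two-valued queued flag and an 84x84 screen) are made-up examples. In a real agent the sizes would be read from the environment's action spec rather than hard-coded.

```python
import random


def random_args(arg_sizes, rng=random):
    """Draw one random value per dimension of every argument.

    arg_sizes: list of tuples, one tuple per argument; each entry is the
    exclusive upper bound of that dimension, so values fall in [0, size).
    """
    return [[rng.randrange(size) for size in sizes] for sizes in arg_sizes]


# Hypothetical example: a queued flag with 2 values, then a point on an
# assumed 84x84 screen.
example_arg_sizes = [(2,), (84, 84)]
args = random_args(example_arg_sizes, random.Random(0))
print(args)
```

Passing a seeded `random.Random` instance makes a run reproducible, which helps when debugging a crash that only shows up after a few steps.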

Will submit a PR.

SoyGema / Startcraft_pysc2_minigames

actions_to_choose structure #5
