inoryy / reaver

Reaver: Modular Deep Reinforcement Learning Framework. Focused on StarCraft II. Supports Gym, Atari, and MuJoCo.
MIT License

Marine stuck in 'MoveToBeacon' #16

Closed King-Of-Knights closed 5 years ago

King-Of-Knights commented 5 years ago

Hi there! Thanks for your great work! 👍 But I ran into an unexpected problem. My environment is Windows 10, and my code is as follows:

import reaver as rvr
from multiprocessing import Process

if __name__ == '__main__':
    p = Process()
    p.start()
    env = rvr.envs.SC2Env(map_name='MoveToBeacon')
    agent = rvr.agents.A2C(env.obs_spec(), env.act_spec(),
                           rvr.models.build_fully_conv, rvr.models.SC2MultiPolicy, n_envs=1)
    agent.run(env)

But I got the following traceback, and the Marine will not move anywhere:

Process Process-2:
Traceback (most recent call last):
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Saber\Anaconda3\lib\site-packages\reaver\envs\base\multiproc.py", line 52, in _run
    obs = self._env.reset()
  File "C:\Users\Saber\Anaconda3\lib\site-packages\reaver\envs\sc2.py", line 69, in reset
    obs, reward, done = self.obs_wrapper(self._env.reset())
  File "C:\Users\Saber\Anaconda3\lib\site-packages\reaver\envs\sc2.py", line 126, in __call__
    obs['feature_screen'][self.feature_masks['screen']],
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\lib\named_array.py", line 145, in __getitem__
    index = _get_index(obj, index)
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\lib\named_array.py", line 207, in _get_index
    "Can't index by type: %s; only int, string or slice" % type(index))
TypeError: Can't index by type: <class 'list'>; only int, string or slice

I also got stuck on 'CartPole-v0'. Nothing is shown even after waiting for quite a while. My code is:

import reaver as rvr
from multiprocessing import Process

if __name__ == '__main__':
    p = Process()
    p.start()
    env = rvr.envs.GymEnv('CartPole-v0')
    agent = rvr.agents.A2C(env.obs_spec(), env.act_spec())
    agent.run(env)

Any idea about this? Thanks!

inoryy commented 5 years ago

Hello,


What happens if you run python -m reaver.run --env CartPole-v0 --agent a2c?

King-Of-Knights commented 5 years ago

@inoryy Hey! For the first note, I followed your advice, but I still get a traceback:

Process Process-2:
Traceback (most recent call last):
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Saber\Anaconda3\lib\site-packages\reaver\envs\base\multiproc.py", line 47, in _run
    obs, rew, done = self._env.step(data)
  File "C:\Users\Saber\Anaconda3\lib\site-packages\reaver\envs\sc2.py", line 55, in step
    obs, reward, done = self.obs_wrapper(self._env.step(self.act_wrapper(action)))
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\lib\stopwatch.py", line 201, in _stopwatch
    return func(*args, **kwargs)
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\env\sc2_env.py", line 491, in step
    self._controllers, self._features, self._obs, actions))
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\lib\run_parallel.py", line 54, in run
    funcs = [f if callable(f) else functools.partial(*f) for f in funcs]
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\lib\run_parallel.py", line 54, in <listcomp>
    funcs = [f if callable(f) else functools.partial(*f) for f in funcs]
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\env\sc2_env.py", line 490, in <genexpr>
    for c, f, o, a in zip(
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\lib\stopwatch.py", line 201, in _stopwatch
    return func(*args, **kwargs)
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\lib\features.py", line 1107, in transform_action
    func_id, func.name))
ValueError: Function 261/Halt_quick is currently not available

It seems that some Marine action is not supported yet. For the second note, python -m reaver.run --env CartPole-v0 --agent a2c seems to work well! Thanks!

inoryy commented 5 years ago

@King-Of-Knights ValueError: Function 261/Halt_quick is currently not available implies the agent attempted to use an action that isn't allowed for a given state - this is very weird.
Can you describe your environment? e.g. Python version, StarCraft version, Anaconda version, etc.
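
For readers unfamiliar with this error: PySC2 observations include an available_actions entry listing the function ids that are legal in the current state, and an agent is expected to choose only among those. The following is a minimal, framework-free sketch of that idea, not Reaver's actual implementation. The helper name sample_available and the example logits are hypothetical and exist only for illustration.

```python
import math
import random


def sample_available(logits, available_ids, rng=random):
    """Sample an action id, giving zero probability to unavailable actions."""
    if not available_ids:
        raise ValueError("no actions available in this state")
    # Softmax restricted to the available ids (subtract the max for stability).
    top = max(logits[i] for i in available_ids)
    weights = [math.exp(logits[i] - top) for i in available_ids]
    return rng.choices(list(available_ids), weights=weights, k=1)[0]


# Hypothetical 5-action policy output; only ids 0, 2 and 3 are legal right now.
logits = [0.1, 5.0, -1.0, 0.3, 2.0]
available = [0, 2, 3]
action = sample_available(logits, available)
print(action)
```

Even though id 1 has the largest logit here, it can never be returned because it is not in the available list. A "currently not available" error therefore suggests that the chosen action and the state it was chosen for got out of sync somewhere.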

King-Of-Knights commented 5 years ago

The reason I use multiprocessing.Process is that without it, I get an annoying error like this:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "C:\Users\Saber\Anaconda3\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "C:\Users\Saber\Anaconda3\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Users\Saber\Anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Saber\Anaconda3\星际争霸-强化学习\1.py", line 7, in <module>
    agent.run(env)
  File "C:\Users\Saber\Anaconda3\lib\site-packages\reaver\agents\base\running.py", line 12, in run
    env.start()
  File "C:\Users\Saber\Anaconda3\lib\site-packages\reaver\envs\base\multiproc.py", line 72, in start
    env.start()
  File "C:\Users\Saber\Anaconda3\lib\site-packages\reaver\envs\base\multiproc.py", line 19, in start
    self.proc.start()
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
    _check_not_importing_main()
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
    is not going to be frozen to produce an executable.''')
RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

This happens if I directly use:

import reaver as rvr

env = rvr.envs.SC2Env(map_name='MoveToBeacon')
agent = rvr.agents.A2C(env.obs_spec(), env.act_spec(), rvr.models.build_fully_conv, rvr.models.SC2MultiPolicy, n_envs=4)
agent.run(env)

King-Of-Knights commented 5 years ago

@inoryy Hey, my Python version is 3.6.5, with Anaconda 5.2, Windows 10, and CUDA 9. I am not sure whether I found the correct version number for StarCraft; it seems to be 4.7.1.7326. I can play online, so I guess I have the latest version. Maybe this is something unexpected on Windows 10?

inoryy commented 5 years ago

@King-Of-Knights this seems to be a Windows issue. Try following the advice given in the error message. Something like this:

import reaver as rvr

if __name__ == '__main__':
    env = rvr.envs.SC2Env(map_name='MoveToBeacon')
    agent = rvr.agents.A2C(env.obs_spec(), env.act_spec(), rvr.models.build_fully_conv,
                           rvr.models.SC2MultiPolicy, n_envs=4)
    agent.run(env)

Unfortunately, I think the bigger issue is with the StarCraft II version - even the latest master of PySC2 doesn't support it yet. I'll see if it's something I could fix on Reaver's end.
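
For reference, the idiom recommended by the RuntimeError can be shown in isolation with the standard library alone. This is a minimal sketch rather than Reaver code. The names worker and run_in_child are hypothetical, and the "spawn" start method is requested explicitly so that the script behaves the same way on Linux as it does on Windows, where spawn is the only option.

```python
import multiprocessing as mp


def worker(conn, value):
    """Runs in the child process: send back a computed result and exit."""
    conn.send(value * value)
    conn.close()


def run_in_child(value, start_method="spawn"):
    """Start one child with an explicit start method and return its result."""
    ctx = mp.get_context(start_method)
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=worker, args=(child_conn, value))
    proc.start()
    child_conn.close()  # the parent keeps only its own end of the pipe
    # Wait up to 30 seconds so a failed child cannot hang the parent forever.
    result = parent_conn.recv() if parent_conn.poll(30) else None
    proc.join(timeout=30)
    return result


if __name__ == "__main__":
    # Under spawn, the child re-imports this file. Without the guard, the
    # re-import would try to start processes again and raise the RuntimeError.
    mp.freeze_support()  # only has an effect in frozen Windows executables
    print(run_in_child(7))
```

Everything that creates processes sits behind the guard, while the functions the child needs stay at module level so that the re-imported module can find them.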

King-Of-Knights commented 5 years ago

@inoryy following your last advice, I still get:

Process Process-3:
Traceback (most recent call last):
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Users\Saber\Anaconda3\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Saber\Anaconda3\lib\site-packages\reaver\envs\base\multiproc.py", line 47, in _run
    obs, rew, done = self._env.step(data)
  File "C:\Users\Saber\Anaconda3\lib\site-packages\reaver\envs\sc2.py", line 55, in step
    obs, reward, done = self.obs_wrapper(self._env.step(self.act_wrapper(action)))
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\lib\stopwatch.py", line 201, in _stopwatch
    return func(*args, **kwargs)
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\env\sc2_env.py", line 491, in step
    self._controllers, self._features, self._obs, actions))
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\lib\run_parallel.py", line 54, in run
    funcs = [f if callable(f) else functools.partial(*f) for f in funcs]
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\lib\run_parallel.py", line 54, in <listcomp>
    funcs = [f if callable(f) else functools.partial(*f) for f in funcs]
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\env\sc2_env.py", line 490, in <genexpr>
    for c, f, o, a in zip(
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\lib\stopwatch.py", line 201, in _stopwatch
    return func(*args, **kwargs)
  File "C:\Users\Saber\Anaconda3\lib\site-packages\pysc2\lib\features.py", line 1107, in transform_action
    func_id, func.name))
ValueError: Function 334/Patrol_minimap is currently not available

Okay, I will stay tuned and wait for your update 👍. Btw, will Reaver support something like Deep SARSA, DQN, and DDPG in the future?

inoryy commented 5 years ago

@King-Of-Knights that error is good news in the sense that you no longer get the multiprocessing issue, only the version mismatch.


For now I have focused on policy-gradient-based methods, as there is already a very good implementation of value-based methods: Dopamine.

After 2.1 I might look into integrating Reaver with Dopamine to add value-based methods that way.

inoryy commented 5 years ago

@King-Of-Knights actually, can you please try running python -m pysc2.bin.agent --map MoveToBeacon a few times and let me know if it crashes with a similar error?

King-Of-Knights commented 5 years ago

Sorry for the delay! Everything works fine with it!

inoryy commented 5 years ago

I've set up a Windows test bed and debugged the issue for a bit - the problem is neither in StarCraft II nor in PySC2, so good news, kind of!

I've narrowed the problem down to differences between Linux and Windows in multiprocessing. I'll close the issue and continue working on the problem in #17 to keep things organized.
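
The thread does not say which difference was responsible. One well-known difference is the start method: Windows only supports spawn, which re-imports the main module in the child, whereas Linux has traditionally defaulted to fork, which copies the parent's memory. The sketch below illustrates that general difference under stated assumptions and is not a diagnosis of the Reaver bug. CONFIG, report and child_view are hypothetical names.

```python
import multiprocessing as mp

# Module-level state: set at import time, then changed by the parent at runtime.
CONFIG = {"map_name": "default"}


def report(conn):
    """Runs in the child: send back whatever CONFIG looks like over there."""
    conn.send(CONFIG["map_name"])
    conn.close()


def child_view(start_method):
    """Return the child's view of CONFIG under the given start method."""
    ctx = mp.get_context(start_method)
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=report, args=(child_conn,))
    proc.start()
    child_conn.close()
    # Wait up to 30 seconds so a failed child cannot hang the parent forever.
    seen = parent_conn.recv() if parent_conn.poll(30) else None
    proc.join(timeout=30)
    return seen


if __name__ == "__main__":
    CONFIG["map_name"] = "MoveToBeacon"  # runtime change, made in the parent only
    for method in mp.get_all_start_methods():
        print(method, "->", child_view(method))
```

A forked child inherits the runtime change, while a spawned child only sees the import-time value, because the guarded block is not executed when the module is re-imported. Code that relies on state set up in the parent after import can therefore work on Linux and fail on Windows.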