oxwhirl / smac

SMAC: The StarCraft Multi-Agent Challenge
MIT License

RLlib + SMAC Example #1

Closed richardliaw closed 5 years ago

richardliaw commented 5 years ago

cc @ericl

ericl commented 5 years ago

Updated to break the example into separate files for clarity. I think this is probably good to merge; perf should presumably be similar to https://github.com/ray-project/ray/pull/3542 since the code is largely unchanged.

samvelyan commented 5 years ago

It seems that running either the run_ppo or the run_qmix script creates num-workers + 1 SC2 processes in total; one of those environments is never used but still takes up memory. Any idea why this happens?

Could you please also state the additional dependencies in the README file, e.g. gym, tf?

ericl commented 5 years ago

Ah, the reason we create n+1 envs is that one is needed for the driver process. It seems we need to create the StarCraft2 env there to get its action and observation spaces. However, we never call any methods on that env.

Is it possible to defer creation of the StarCraft process until reset() is called? That would allow access to obs and action space without extra overheads.
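A minimal sketch of what deferring could look like, assuming the space sizes can be computed from constructor arguments alone. `LazyStarCraftEnv` and `FakeSC2Process` are hypothetical stand-ins for illustration, not SMAC's actual classes, and the sizes are made-up defaults:

```python
class FakeSC2Process:
    """Hypothetical stand-in for the expensive StarCraft II process."""

    launched = 0  # counts how many processes have been started

    def __init__(self):
        FakeSC2Process.launched += 1


class LazyStarCraftEnv:
    """Hypothetical env: sizes are known at construction, the process is not started."""

    def __init__(self, n_agents=3, obs_size=30, n_actions=9):
        # Illustrative sizes only; a real env would derive them from map parameters.
        self.n_agents = n_agents
        self.obs_size = obs_size
        self.n_actions = n_actions
        self._proc = None  # nothing is launched yet

    def get_obs_size(self):
        return self.obs_size

    def get_total_actions(self):
        return self.n_actions

    def reset(self):
        # Launch the process on first use only; later resets reuse it.
        if self._proc is None:
            self._proc = FakeSC2Process()
        return [[0.0] * self.obs_size for _ in range(self.n_agents)]


driver_env = LazyStarCraftEnv()
print(driver_env.get_obs_size(), driver_env.get_total_actions())
print("processes launched:", FakeSC2Process.launched)
```

With this shape, an env that is only queried for its sizes never starts a process, which is the saving being asked for here. It only helps if the caller never invokes reset() on that instance.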

ericl commented 5 years ago

Added TF dep in readme. The rllib package already includes the rest of the deps.

samvelyan commented 5 years ago

Done in 91761f1

ericl commented 5 years ago

Awesome, I think that should fix the extra memory use issue then.

samvelyan commented 5 years ago

This didn't solve it. Apparently, reset() is still being called on the first env instance, which launches the SC2 process. You could use class methods to get the action/observation/state spaces without calling reset().
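A rough sketch of the class-method idea, assuming the space sizes depend only on static map information. `StarCraftEnvSketch`, `get_spaces`, the `toy_3v3` map, and all the numbers are hypothetical, not taken from SMAC:

```python
from typing import NamedTuple


class Spaces(NamedTuple):
    obs_size: int
    state_size: int
    n_actions: int


class StarCraftEnvSketch:
    """Hypothetical env whose space sizes can be queried without an instance."""

    # Illustrative static map table; the entry and its values are made up.
    _MAP_INFO = {"toy_3v3": {"n_agents": 3, "n_enemies": 3}}
    _FEATURES_PER_UNIT = 5      # assumed per-unit feature count
    _N_NON_ATTACK_ACTIONS = 6   # assumed number of non-attack actions

    instances = 0  # counts constructed envs, to show none is needed

    def __init__(self, map_name):
        StarCraftEnvSketch.instances += 1
        self.map_name = map_name

    @classmethod
    def get_spaces(cls, map_name):
        """Compute space sizes from the map table, without creating an env."""
        info = cls._MAP_INFO[map_name]
        n_units = info["n_agents"] + info["n_enemies"]
        obs_size = n_units * cls._FEATURES_PER_UNIT
        return Spaces(
            obs_size=obs_size,
            state_size=info["n_agents"] * obs_size,
            n_actions=cls._N_NON_ATTACK_ACTIONS + info["n_enemies"],
        )


spaces = StarCraftEnvSketch.get_spaces("toy_3v3")
print(spaces)
print("env instances created:", StarCraftEnvSketch.instances)
```

The driver could call such a method directly and skip constructing an env at all, so no reset() call could launch a process on its side.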

ericl commented 5 years ago

@samvelyan this will be fixed in the next RLlib release: https://github.com/ray-project/ray/pull/3810