openai / multiagent-particle-envs

Code for a multi-agent particle environment used in the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments"
https://arxiv.org/pdf/1706.02275.pdf
MIT License

Compatibility with openai/baselines #34


mmmikael commented 5 years ago

Are these environments compatible with the OpenAI baselines implementations?

At first sight, it looks like the learners in openai/baselines don't support environments whose observation and action spaces are lists (one space per agent).

For example, the script below raises this exception:

~/tmp/baselines/baselines/deepq/deepq.py in learn(env, network, seed, lr, total_timesteps, buffer_size, exploration_fraction, exploration_final_eps, train_freq, batch_size, print_freq, checkpoint_freq, checkpoint_path, learning_starts, gamma, target_network_update_freq, prioritized_replay, prioritized_replay_alpha, prioritized_replay_beta0, prioritized_replay_beta_iters, prioritized_replay_eps, param_noise, callback, load_path, **network_kwargs)
    202         make_obs_ph=make_obs_ph,
    203         q_func=q_func,
--> 204         num_actions=env.action_space.n,
    205         optimizer=tf.train.AdamOptimizer(learning_rate=lr),
    206         gamma=gamma,

AttributeError: 'list' object has no attribute 'n'
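
For context on the error: MultiAgentEnv stores its spaces in plain Python lists, one entry per agent, rather than in a single gym.Space, so any learner that reads env.action_space.n directly will fail. A quick check (a sketch, assuming the simple_tag scenario):

from multiagent.environment import MultiAgentEnv
import multiagent.scenarios as scenarios

scenario = scenarios.load("simple_tag.py").Scenario()
world = scenario.make_world()
env = MultiAgentEnv(world, scenario.reset_world, scenario.reward, scenario.observation)

print(type(env.action_space))   # <class 'list'> -- one space per agent
print(env.action_space[0])      # e.g. Discrete(5); the per-agent space does have .n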

Here is the code that instantiates a baselines agent with a multi-agent environment:

from baselines.common.vec_env.subproc_vec_env import SubprocVecEnv
from baselines.run import get_learn_function

from multiagent.environment import MultiAgentEnv
import multiagent.scenarios as scenarios

# Arguments shared by every learner.
common_kwargs = dict(total_timesteps=30000, network="mlp", gamma=1.0, seed=0)

# Per-algorithm overrides.
learn_kwargs = {
    'a2c': dict(nsteps=32, value_network='copy', lr=0.05),
    'acktr': dict(nsteps=32, value_network='copy'),
    'deepq': dict(total_timesteps=20000),
    'ppo2': dict(value_network='copy'),
    'trpo_mpi': {}
}
alg = "deepq"

kwargs = common_kwargs.copy()
kwargs.update(learn_kwargs[alg])

def learn_fn(e):
    return get_learn_function(alg)(env=e, **kwargs)

def env_fn():
    # Build the simple_tag scenario and wrap it in a MultiAgentEnv.
    scenario = scenarios.load("simple_tag.py").Scenario()
    world = scenario.make_world()
    return MultiAgentEnv(world, scenario.reset_world, scenario.reward,
                         scenario.observation, scenario.benchmark_data)

env = SubprocVecEnv([env_fn])
model = learn_fn(env)
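
For what it's worth, the only workaround I can think of is to hide all but one agent behind a standard single-agent interface, so that baselines sees a single action/observation space while the remaining agents act randomly. Below is a rough, untested sketch of that idea; SingleAgentWrapper and one_hot are my own hypothetical helpers, and it assumes MultiAgentEnv's default one-hot action encoding (discrete_action_input=False):

import gym
import numpy as np

def one_hot(space, idx):
    # With discrete_action_input=False (the default), MultiAgentEnv expects
    # each discrete action as a one-hot vector rather than an integer index.
    a = np.zeros(space.n)
    a[idx] = 1.0
    return a

class SingleAgentWrapper(gym.Env):
    """Expose agent 0 of a MultiAgentEnv as a single-agent gym.Env;
    every other agent takes a uniformly random action."""

    def __init__(self, ma_env):
        self.ma_env = ma_env
        # Pick out agent 0's spaces so baselines sees a plain gym.Space
        # (with .n) instead of a Python list of per-agent spaces.
        self.action_space = ma_env.action_space[0]
        self.observation_space = ma_env.observation_space[0]

    def reset(self):
        # MultiAgentEnv.reset() returns one observation per agent.
        return self.ma_env.reset()[0]

    def step(self, action):
        # The learned (integer) action drives agent 0; the rest act randomly.
        acts = [one_hot(self.action_space, action)]
        acts += [one_hot(sp, sp.sample()) for sp in self.ma_env.action_space[1:]]
        obs_n, rew_n, done_n, info_n = self.ma_env.step(acts)
        return obs_n[0], rew_n[0], any(done_n), info_n

env_fn would then return SingleAgentWrapper(env) instead of env, at the cost of reducing the problem to a single learning agent against random opponents.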
Zamiell commented 5 years ago

Hello @mmmikael. I too want to use baselines so that I can apply the latest algorithms in my multi-agent machine learning project. Did you ever find a solution for this, or code a workaround?

Usmaniatech commented 5 years ago

Hi @mmmikael, please do share the solution if you were able to get multi-agent environments working with OpenAI baselines.

indhra commented 4 years ago

@mmmikael @Zamiell @Usmaniatech did any of you resolve the issue? If so, please share the solution.