google-deepmind / mujoco_mpc

Real-time behaviour synthesis with MuJoCo, using Predictive Control
https://github.com/deepmind/mujoco_mpc
Apache License 2.0

Spawn Multiple Agents in the same Scene #217

Closed: Edu4000 closed this issue 9 months ago

Edu4000 commented 9 months ago

Hello, I am a student interested in simulating a multi-agent system inside MuJoCo. I have been testing running and controlling the Unitree A1 quadruped using the examples in this repository, mainly the ui_agent_server with the task_flat.xml file, while changing the position of the goal at runtime. However, I have been trying to spawn more than one quadruped in a single simulation, to no avail.

My question, then, is whether it is possible to have two or more A1 agents inside a single worldbody/model. Should I modify task_flat.xml and add a second import of a1.xml, or should I create a second agent inside the code?
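For concreteness, here is a sketch of the XML route. It assumes MuJoCo 3.0 or newer, whose `<model>` asset and `<attach>` element let you stamp prefixed copies of a child model into a parent (simply `<include>`-ing a1.xml twice fails with duplicate-name errors), and it assumes the A1's root body is named `trunk`, as the MWE below suggests; the paths and prefixes are hypothetical:

```xml
<mujoco model="two_a1">
  <asset>
    <!-- Hypothetical path: point this at the a1.xml used by the task. -->
    <model name="a1" file="a1.xml"/>
  </asset>
  <worldbody>
    <!-- Each <attach> stamps a copy of the child model, with all of its
         names prefixed, at the pose of the enclosing frame. -->
    <frame pos="0 0 0.27">
      <attach model="a1" body="trunk" prefix="a1_0_"/>
    </frame>
    <frame pos="0 1 0.27">
      <attach model="a1" body="trunk" prefix="a1_1_"/>
    </frame>
  </worldbody>
</mujoco>
```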

Minimum working example of what I have so far:

```python
import pathlib
import time as time_

import mujoco
import mujoco.viewer
import numpy as np
from mujoco_mpc import agent as agent_lib

# Creating model and data.
m = mujoco.MjModel.from_xml_path("./tasks/quadruped/task_flat.xml")
d = mujoco.MjData(m)

# Initializing our agent (agent server/executable).
agent = agent_lib.Agent(
    # ui_agent_server enables the UI.
    server_binary_path=pathlib.Path(agent_lib.__file__).parent
    / "mjpc"
    / "ui_agent_server",
    task_id="Quadruped Flat",
    model=m,
)
# TODO: Create more agents here?

# Weights.
agent.set_cost_weights({"Position": 0.15})
print("Cost weights:", agent.get_cost_weights())

# Parameters.
agent.set_task_parameter("Walk speed", 2.0)
print("Parameters:", agent.get_task_parameters())

goals = [
    [5, 0, 0.26],
    [0, 5, 0.26],
    [-5, 0, 0.26],
    [0, -5, 0.26],
]
i = 0

with mujoco.viewer.launch_passive(m, d) as viewer:
    # Close the viewer automatically after 30 wall-seconds.
    start = time_.time()
    while viewer.is_running() and time_.time() - start < 30:
        # Set planner state.
        agent.set_state(
            time=d.time,
            qpos=d.qpos,
            qvel=d.qvel,
            act=d.act,
            mocap_pos=d.mocap_pos,
            mocap_quat=d.mocap_quat,
            userdata=d.userdata,
        )
        # TODO: Add more agents' states here?

        # Run planner for num_steps.
        num_steps = 8
        for _ in range(num_steps):
            agent.planner_step()

        # Set ctrl from agent policy.
        d.ctrl = agent.get_action()

        # Goal reposition policy.
        if np.linalg.norm(d.mocap_pos[0] - d.body("trunk").xpos) < 1:
            print("\nARRIVED!")
            d.mocap_pos[0] = goals[i]
            i = (i + 1) % len(goals)

        mujoco.mj_step(m, d)

        # Pick up changes to the physics state, apply perturbations,
        # update options from GUI.
        viewer.sync()
```

I have not yet understood all the ways one can interact with the agent_server or ui_agent_server, so I am wondering whether this can be done at all.

erez-tom commented 9 months ago

Under the hood, agent_lib spins up a C++ process that is stateful (as opposed to the functional, stateless convention of JAX, for example): it remembers the model you initialized it with, calling planner_step mutates its internal state, and get_action only reads that internal state. The frontend of this process is a gRPC server that implements the messages described in agent.proto, which is how the Python client communicates with the planner process.
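To make the read/write split concrete, here is the call sequence from the example above, annotated per this description (the comments restate the behavior explained here, not anything additionally verified):

```python
# All three calls talk to the same stateful planner process over gRPC.
agent.set_state(
    time=d.time, qpos=d.qpos, qvel=d.qvel, act=d.act,
    mocap_pos=d.mocap_pos, mocap_quat=d.mocap_quat, userdata=d.userdata,
)                        # write: overwrite the planner's copy of the state
agent.planner_step()     # write: improve the stored plan in place
u = agent.get_action()   # read: query the stored plan; no planning happens
u2 = agent.get_action()  # read again: still no planning, same stored plan
```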

Therefore, there is no easy way to run multiple, independent agents. You could try to hack agent_lib to spin up separate planning processes, each with its own gRPC channel. If you figure it out, please share your code!
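A sketch of what that hack might look like, assuming (none of this is verified) that each Agent constructor can be made to launch its own server binary and gRPC channel on a distinct port, and that the combined model lays out the two robots' actuators contiguously; the file name, the actuator slicing, and the idea that the Quadruped Flat task would even accept a two-robot model are all hypothetical:

```python
import mujoco
from mujoco_mpc import agent as agent_lib

# Hypothetical combined model containing both robots
# (see the XML sketch earlier in the thread).
m = mujoco.MjModel.from_xml_path("./tasks/quadruped/task_two_a1.xml")
d = mujoco.MjData(m)

# One planner process per robot. This only helps if each constructor
# spawns its own server binary on its own port.
agents = [agent_lib.Agent(task_id="Quadruped Flat", model=m) for _ in range(2)]

nu = m.nu // 2  # assumes actuators are grouped robot-by-robot

for _ in range(1000):
    for k, a in enumerate(agents):
        # Each planner sees the full state of the combined model...
        a.set_state(
            time=d.time, qpos=d.qpos, qvel=d.qvel, act=d.act,
            mocap_pos=d.mocap_pos, mocap_quat=d.mocap_quat,
            userdata=d.userdata,
        )
        a.planner_step()
        # ...but only drives its own robot's actuators. (Making each
        # planner's cost care only about its own robot is the hard,
        # unsolved part of this sketch.)
        d.ctrl[k * nu:(k + 1) * nu] = a.get_action()[k * nu:(k + 1) * nu]
    mujoco.mj_step(m, d)
```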