google-deepmind / mujoco_mpc

Real-time behaviour synthesis with MuJoCo, using Predictive Control
https://github.com/deepmind/mujoco_mpc
Apache License 2.0

Question regarding simultaneous agent server access #227

Open Iz-Idk opened 9 months ago

Iz-Idk commented 9 months ago

Hi, regarding issue #217: we have been trying to understand it better, but we have run into a problem involving mocap_pos. From what I understand, when an agent server is launched it finds a free port to serve on (and the two ui_agent_servers do connect to different ports at the same time). Even though I want to give a different goal to each robot, each robot takes the last goal that was set, i.e. whichever one was most recently updated. Is it because the gRPC client is using the same server, or is there anything you could recommend? Here is an example code:

Working example:

```python
import time as time_
import mujoco
import mujoco.viewer as viewer
from mujoco_mpc import agent as agent_lib
import numpy as np
import pathlib

# Creating model and data
m = mujoco.MjModel.from_xml_path(
    "../../../../../mujoco_mpc/mjpc/tasks/quadruped/task_flat2.xml")
d = mujoco.MjData(m)

# Initializing our agents (agent server/executable)
agent = agent_lib.Agent(
    # This is to enable the ui
    server_binary_path=pathlib.Path(agent_lib.__file__).parent
    / "mjpc"
    / "ui_agent_server",
    task_id="Quadruped Flat",
    model=m)
agent2 = agent_lib.Agent(
    # Enable the ui and find a free port
    server_binary_path=pathlib.Path(agent_lib.__file__).parent
    / "mjpc"
    / "ui_agent_server",
    task_id="Quadruped Flat",
    model=m)

## Data for agents
# weights
agent.set_cost_weights({"Position": 0.15})
print("Cost weights:", agent.get_cost_weights())

# parameters
agent.set_task_parameter("Walk speed", 1.0)
print("Parameters:", agent.get_task_parameters())

# weights for second agent
agent2.set_cost_weights({"Position": 0.15})
print("Cost weights:", agent2.get_cost_weights())

# parameters for second agent
agent2.set_task_parameter("Walk speed", 1.0)
print("Parameters:", agent2.get_task_parameters())

# run planner for num_steps
num_steps = 8
for _ in range(num_steps):
    agent.planner_step()
    agent2.planner_step()

# Multiple goals
goals = [
    [7, 0, 0.26],
    [0, -2, 0.26],
    [-2, 0, 0.26],
    [0, -5, 0.26],
]
i = 0

# Setting different goals for the different robots
goal_agent = d.mocap_pos
goal_agent[0] = goals[0]
print(goal_agent)
goal_agent2 = d.mocap_pos
goal_agent2[0] = goals[1]
print(goal_agent)

with mujoco.viewer.launch_passive(m, d) as viewer:
    while viewer.is_running():
        # set planner state
        agent.set_state(
            time=d.time,
            qpos=d.qpos[:19],
            qvel=d.qvel[:18],
            act=d.act,
            mocap_pos=goal_agent,
            mocap_quat=d.mocap_quat,
        )
        # set planner state for agent 2
        agent2.set_state(
            time=d.time,
            qpos=d.qpos[19:],
            qvel=d.qvel[18:],
            act=d.act,
            mocap_pos=goal_agent2,
            mocap_quat=d.mocap_quat,
        )
        # Seems to take into account only the second agent's goal
        print(d.mocap_pos)

        # Obtain the controls from each agent and apply them to the robots
        agent_ctrl = agent.get_action()
        agent_ctrl2 = agent2.get_action()
        agent_ctrl = np.append(agent_ctrl, agent_ctrl2, axis=0)
        d.ctrl = agent_ctrl

        # Detect when a robot is close to the goal
        if (np.linalg.norm(d.mocap_pos[0] - d.body('trunk').xpos) < 1) or \
           (np.linalg.norm(d.mocap_pos[0] - d.body('trunk2').xpos) < 1):
            goal_agent2[0] = goals[i]  # Only updates 1 agent's goal
            print("\nARRIVED!")
            i = (i + 1) % len(goals)

        mujoco.mj_step(m, d)

        # Example modification of a viewer option: toggle contact points every two seconds.
        with viewer.lock():
            viewer.opt.flags[mujoco.mjtVisFlag.mjVIS_CONTACTPOINT] = int(d.time % 2)

        # Pick up changes to the physics state, apply perturbations, update options from GUI.
        viewer.sync()
```
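To double-check the port question on our side, something like the snippet below could work. It assumes the Agent object keeps the port it bound (e.g. an attribute such as `port` in agent.py); that attribute name is an assumption and may differ between mujoco_mpc versions.

```python
# Rough sanity check (assumption: the Agent object stores the port it bound,
# e.g. an attribute named `port` in agent.py; the name may differ in other
# mujoco_mpc versions). If the ports differ, each agent is talking to its own
# server process over its own gRPC channel.
print("agent 1 port:", agent.port)
print("agent 2 port:", agent2.port)
assert agent.port != agent2.port, "both agents connected to the same server"
```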

A few notes: task_flat2 contains two robots with different spawn points, and they both go to the same goal. We have also tried setting both goals, to no avail. Here is a video example: Video. Thank you for your time and sorry for the trouble.

thowell commented 7 months ago

Cool project!

I took a look at the code you provided and didn't immediately see any problems. Can you provide a link to task_flat2.xml? That way I can try to replicate the issue on my end.

Iz-Idk commented 7 months ago

Hi, thank you for your answer! The code for task_flat2.xml is:

task_flat2.xml:

```xml
```

There were a few modifications to a1.xml too, so I will also put it here.

a1.xml:

```xml
```

Thank you again for your help :)

thowell commented 7 months ago

A couple things to look into:

If you think the mocap goal position is the issue, I recommend calling get_state after set_state to ensure that the mocap input to the agent was valid.
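A minimal sketch of that check, assuming get_state returns a state message that exposes mocap_pos (the exact return format may differ between mujoco_mpc versions):

```python
# Rough sketch: after pushing a goal with set_state, read the planner state
# back to confirm the mocap goal each agent actually received.
# Assumption: get_state() returns a message exposing mocap_pos; the exact
# field layout may differ in your mujoco_mpc version.
agent.set_state(
    time=d.time,
    qpos=d.qpos[:19],
    qvel=d.qvel[:18],
    act=d.act,
    mocap_pos=goal_agent,
    mocap_quat=d.mocap_quat,
)
agent2.set_state(
    time=d.time,
    qpos=d.qpos[19:],
    qvel=d.qvel[18:],
    act=d.act,
    mocap_pos=goal_agent2,
    mocap_quat=d.mocap_quat,
)

print("agent 1 sees mocap_pos:", agent.get_state().mocap_pos)
print("agent 2 sees mocap_pos:", agent2.get_state().mocap_pos)
```

If the two printouts already match here, the goals are being overwritten before they reach the planners; if they differ, the issue is more likely downstream of set_state.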