nsidn98 / InforMARL

Code for our paper: Scalable Multi-Agent Reinforcement Learning through Intelligent Information Aggregation
https://nsidn98.github.io/InforMARL/
MIT License

Have some questions about the scene? #19

Closed Yu-zx closed 3 months ago

Yu-zx commented 4 months ago

What is the size unit of the map in the current scene design? What distance does the agent move at each step, and in what unit? Is the current map continuous or grid-based?

Yu-zx commented 4 months ago

When updating the agent's position, can it be updated in the scene file, or does it need to go through the step() function in the env? If I want to add some indicators, do I need to modify the core env files in addition to the scene file, or only the scene file? Is there a detailed explanation of this? Thank you very much for taking the time to answer!

nsidn98 commented 4 months ago

The map is a 2-unit square. You can choose the world-size here.

At every step, a force of $\pm1$ is applied to the agent in the specified direction, and the agent follows a double-integrator model integrated over a time step of dt = 0.1.
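
For intuition, here is a simplified sketch of an MPE-style double-integrator update (illustrative only, assuming unit mass, the default dt = 0.1, and a velocity-damping term; see the world physics code in the repo for the exact implementation):

import numpy as np

# Simplified MPE-style double-integrator step (illustrative, not the repo's exact code).
# Assumptions: unit mass, dt = 0.1, damping = 0.25, optional max-speed clamp.
def integrate(pos, vel, force, dt=0.1, damping=0.25, max_speed=None):
    vel = vel * (1 - damping)      # velocity decay
    vel = vel + force * dt         # acceleration from the +/-1 action force
    if max_speed is not None:
        speed = np.linalg.norm(vel)
        if speed > max_speed:
            vel = vel / speed * max_speed
    pos = pos + vel * dt           # position update
    return pos, vel

# Example: one step from rest with a unit force along +x
pos, vel = integrate(np.zeros(2), np.zeros(2), np.array([1.0, 0.0]))
# pos is roughly [0.01, 0.0] and vel roughly [0.1, 0.0]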

The map is continuous and not grid-designed.

To update the position, use the step() function. The scene file (and environment.py) only initialises the environment and specifies the physics.

What kind of indicators do you want to add?

Yu-zx commented 4 months ago

So when the scene file is modified, the corresponding parts of the env may also need to be modified, right?

nsidn98 commented 4 months ago

Ideally you won't have to change much in the environment file unless you are changing the signature of the function.

Yu-zx commented 4 months ago

Can you give an example to illustrate this?

Yu-zx commented 4 months ago

(screenshot attached) The change in position at each step here does not seem very noticeable. Why is this? If I want to control the changes in speed and position so that they match a realistic setting, where should I set that?

Yu-zx commented 4 months ago

For example, how should I set the speed according to the size of the map, as well as the goal-completion condition and the number of steps per episode? Do you have any suggestions?

nsidn98 commented 4 months ago

You can define the states and velocities here.
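
For example, initial states and velocities are typically set in the scenario's reset_world() (a minimal sketch following the MPE convention; the attribute names may differ slightly from the repo):

import numpy as np

# Illustrative MPE-style reset: place agents randomly inside the 2-unit square and start them at rest.
def reset_world(self, world):
    for agent in world.agents:
        agent.state.p_pos = np.random.uniform(-1, +1, world.dim_p)
        agent.state.p_vel = np.zeros(world.dim_p)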

The completion of goal can be set in done_callback() in scene files.

The number of steps can be set by using the --episode_length flag and changing it from the default value of 25 to anything you like. Note that it gets harder to train RL models to convergence with longer horizons.

Yu-zx commented 4 months ago

How do I set the completion condition in done_callback()? Is it what the done function is based on? In addition, how should I reasonably choose my step size?

nsidn98 commented 4 months ago

The done function is based on done_callback(). Every time env.step() is called, this callback is called internally to check whether the episode is done. You can simply return done=True from it whenever your criterion for completing the episode is satisfied.
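
A minimal sketch of what such a callback might look like (the goal attribute and the maximum-step check are assumptions here; check the scenario file, e.g. navigation.py, for the exact method name and signature):

import numpy as np

# Illustrative done callback for an MPE-style Scenario (not the repo's exact code).
def done(self, agent, world):
    # episode is done for this agent once it is close enough to its goal
    # (`agent.goal` is an assumed attribute)
    dist_to_goal = np.linalg.norm(agent.state.p_pos - agent.goal.state.p_pos)
    if dist_to_goal < 0.1:
        return True
    # or once the maximum episode length is reached
    # (`world.max_steps` is an assumed attribute; use whatever limit your scenario stores)
    return world.current_time_step >= world.max_steps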

Yu-zx commented 4 months ago

The done method is defined in the Scenario class in the navigation.py file as done_callback, right?

Yu-zx commented 4 months ago

Is the action space here discrete or pseudo-discrete? I see that some reinforcement learning algorithms require continuous or pseudo-discrete actions. So is this the default setting? Are no additional parameters required?

nsidn98 commented 4 months ago

Yes, the done method is defined as done_callback() in the scenario file, e.g., navigation.py.

The action space is discrete in our MPE environment. The code will automatically detect a discrete action space and set the neural network architecture to accommodate discrete action outputs.

Yu-zx commented 4 months ago

If it is a discrete action space, is it mapped into continuous actions later so that it can be used with MADDPG or MAPPO? Do those algorithms require a continuous action space? Is my understanding correct, or how does it work? Can you explain it in detail and give an example?

nsidn98 commented 4 months ago

Both MADDPG and MAPPO can use discrete actions, so there is no need to map discrete actions to continuous ones. Refer to ActLayer here to see how the network architecture is modified to accommodate different action spaces.
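
As a rough illustration of the idea (this is not the actual ActLayer code), a discrete action head simply outputs a categorical distribution over the fixed set of actions:

import torch
import torch.nn as nn
from torch.distributions import Categorical

# Schematic discrete action head (illustrative; see ActLayer in the repo for the real architecture).
class DiscreteActionHead(nn.Module):
    def __init__(self, hidden_dim: int, num_actions: int):
        super().__init__()
        self.logits = nn.Linear(hidden_dim, num_actions)

    def forward(self, features):
        dist = Categorical(logits=self.logits(features))  # distribution over the discrete actions
        action = dist.sample()
        return action, dist.log_prob(action)

# Usage: 5 discrete actions (no-op, +/-x force, +/-y force), as in MPE navigation
head = DiscreteActionHead(hidden_dim=64, num_actions=5)
action, logp = head(torch.randn(1, 64))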

Yu-zx commented 4 months ago

I understand. The current action space is discrete, and the distance of each move is subject to the dynamics constraints and the maximum speed limit. Is this correct?

nsidn98 commented 4 months ago

Yes, the distance travelled depends on the time step dt and the applied acceleration force of $\pm1$.
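
As a rough worked example (assuming unit mass and ignoring velocity damping): one step from rest with a force of $+1$ changes the velocity by about $1 \times 0.1 = 0.1$ units per step and the position by about $0.1 \times 0.1 = 0.01$ units, which is why the per-step position change looks small relative to the 2-unit map.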

Yu-zx commented 4 months ago

OK, then are there any constraints in the scene? Can you give some examples? Also, is there any wall handling to prevent agents from hitting walls?

nsidn98 commented 4 months ago

There are no constraints in the scene. The only way to make the agents avoid collisions is through the reward function where the agents get a penalty for colliding with another entity. Refer here.
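
A sketch of the kind of collision penalty used in MPE-style navigation rewards (illustrative; the goal attribute and the penalty magnitude are assumptions, not the repo's exact code):

import numpy as np

# Illustrative MPE-style reward with a collision penalty.
def reward(self, agent, world):
    # distance-to-goal shaping term (`agent.goal` is an assumed attribute)
    rew = -np.linalg.norm(agent.state.p_pos - agent.goal.state.p_pos)
    for other in world.agents:
        if other is agent:
            continue
        dist = np.linalg.norm(agent.state.p_pos - other.state.p_pos)
        if dist < agent.size + other.size:  # overlapping radii means a collision
            rew -= 1.0                      # penalty magnitude is an assumption
    return rew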

Yu-zx commented 4 months ago

I would like to ask whether you would support using a PID dynamics model. If I wanted to switch to a PID dynamics model, could you briefly explain the required modifications?

Yu-zx commented 4 months ago

If I modify the layout and map size in the scene file, and add static obstacles, dynamic obstacles, and some related content, do I need to modify env.py?

nsidn98 commented 4 months ago

You can modify the layout, map size, obstacles, etc. in the scene file itself (e.g., navigation_graph.py).

If you want to add controllable agents, you can add scripted_agents in the scenario. But you will have to control them internally, based on the PID state, every time env.step is called.

Yu-zx commented 4 months ago

The first question: if I set the map size in the scene file, do I need to modify the world size in the core file to match it? If I add dynamic obstacles, do I need to add movement logic in the env, or do I just need to set movable to True in the scene file?

Or do I only need to modify the scene file, without touching the core and env files?

The second question: what does a controllable agent mean? If I want to modify the model, does this affect it? I don't quite understand the second answer; can you give an example?

The third question: if the entity sizes are modified, do I need to add boundary detection, and should that go in the world or in the scene file?

nsidn98 commented 3 months ago
  1. If you modify the world size in the scene file, the world size is updated automatically; you do not need to change it separately in the core file.
  2. If you add dynamic obstacles, you will need to add movement logic in the env.step() function. In there, you can have a routine to control the dynamic obstacles and another routine (already in-built) to control the agents.
  3. If you set movable to True, then they will follow the collision dynamics in the environment. So yes, you will have to set it to True.
  4. Generally, the core and env files don't need to be changed.
  5. A controllable agent is one that you control using external logic rather than relying on the underlying RL policy.

An example pulled from here

def step(self, action_n: List) -> Tuple[List, List, List, List]:
    self.current_step += 1
    obs_n = []
    reward_n = []
    done_n = []
    info_n = []
    self.world.current_time_step += 1
    self.agents = self.world.policy_agents
    # set action for each agent
    for i, agent in enumerate(self.agents):
        self._set_action(action_n[i], agent, self.action_space[i])
    # advance world state
    self.world.step()
    # record observation for each agent
    for agent in self.agents:
        obs_n.append(self._get_obs(agent))
        reward = self._get_reward(agent)
        reward_n.append(reward)
        done_n.append(self._get_done(agent))
        info = {"individual_reward": reward}
        env_info = self._get_info(agent)
        info.update(env_info)  # nothing fancy here, just appending dict to dict
        info_n.append(info)
    # control all scripted agents with PID
    # (getPIDAction is a user-supplied controller, not something provided by the repo)
    for i, agent in enumerate(self.scripted_agents):
        action_scripted = getPIDAction(agent)
        self._set_action(action_scripted, agent, self.action_space[i])

    # all agents get total reward in cooperative case
    reward = np.sum(reward_n)
    if self.shared_reward:
        reward_n = [reward] * self.n

    return obs_n, reward_n, done_n, info_n

Boundary detection is not implemented; there is nothing stopping an agent from moving outside the environment boundaries.

Yu-zx commented 3 months ago

OK, your answer is very detailed. In addition, could you explain the general role of a controllable agent? More specifically, what can I do with such an agent? If possible, please give some relevant examples.

nsidn98 commented 3 months ago

You can control these scripted agents using a different algorithm. A few examples would be agents that just follow straight lines, move in circles, or perform reciprocal velocity obstacle (RVO) avoidance, etc.
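
For instance, a minimal scripted controller that drives an agent in a circle could look like this (a sketch; mapping the returned force direction onto the discrete action set is left to you):

import numpy as np

# Illustrative "move in a circle" controller for a scripted agent (not part of the repo).
# Returns a desired 2D force direction tangent to a circle with the given angular velocity.
def circle_controller(t: int, dt: float = 0.1, omega: float = 1.0) -> np.ndarray:
    angle = omega * t * dt
    return np.array([-np.sin(angle), np.cos(angle)])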

nsidn98 commented 3 months ago

Closing due to inactivity. Please reopen if the issue still persists.

Yu-zx commented 3 months ago

I would like to ask where I should add a boundary check so that agents do not move outside the set boundaries.