When updating the agent's position, can it be updated in the scene file, or does it have to go through the step() function in the env? If I want to add some indicators, do I also need to modify the core env files in addition to the scene file, or is modifying the scene file enough? Is there a detailed explanation of this anywhere? Thank you very much for taking the time to answer!
The map is a square of side 2 units. You can choose the world size here.
At every step a force of $\pm1$ is applied to the agent in the specified direction, and the agent follows a double-integrator model over a time step of dt=0.1.
The map is continuous, not grid-based.
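For intuition, here is a minimal sketch of a double-integrator update under these assumptions (force of $\pm1$, dt=0.1); the variable names and the damping value are illustrative, not the library's actual code:

```python
import numpy as np

# Illustrative double-integrator step (not the repo's actual code).
dt = 0.1          # integration time step
damping = 0.25    # MPE-style velocity damping; treat the exact value as an assumption
mass = 1.0

pos = np.zeros(2)              # agent position inside the 2 x 2 world
vel = np.zeros(2)              # agent velocity
force = np.array([1.0, 0.0])   # +1 force along x, chosen by the action

# One step of the dynamics: force changes velocity, velocity changes position.
vel = vel * (1.0 - damping) + (force / mass) * dt
pos = pos + vel * dt
```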
To update the position, use the step() function. The scene file (and the environment.py) is just to initialise the environment and to specify the physics.
What kind of indicators do you want to add?
So when I modify the scene file and add something, the corresponding part of the env may also need to be modified, right?
Ideally you won't have to change much in the environment file unless you are changing the signature of the function.
Can you give an example to illustrate this?
Here, the change in position at each step does not seem very obvious. Why is that? If I want to control the change in speed and position so that it matches a realistic setting, where should I set this?
For example, how do I set the speed according to the size of the map, as well as the goal-completion condition and the number of steps per episode? Do you have any suggestions?
You can define the states and velocities here.
The completion of the goal can be set in done_callback() in the scenario files.
The number of steps can be set with the --episode_length flag, changing it from the default value of 25 to anything you like. Note that it gets harder to train RL models to convergence with longer horizons.
How do I set the completion status in done_callback()? Is it based on the done function? Also, how should I reasonably choose my step size?
The done function is based on done_callback(). Every time env.step() is called, this callback is called internally to check whether the episode is done. You can simply return done=True in it whenever your criterion for episode completion is satisfied.
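As an illustration, a minimal sketch of what such a callback could look like in a scenario file; the goal attribute and the distance threshold are assumptions, not the repo's exact code:

```python
import numpy as np

# Hypothetical done_callback for a navigation-style scenario (attribute names are illustrative).
def done_callback(self, agent, world):
    # Mark the episode as done for this agent once it is within 0.05 units of its goal.
    dist_to_goal = np.linalg.norm(agent.state.p_pos - agent.goal.state.p_pos)
    if dist_to_goal < 0.05:
        return True
    # Otherwise rely on the --episode_length limit handled by the environment.
    return False
```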
The done method is defined in the Scenario class in the navigation.py file as done_callback, right?
Is the action space here discrete or pseudo-discrete? I see that some of the reinforcement learning algorithms used need the action space to be continuous or pseudo-discrete. So is this the default setting? Are no additional parameters required?
Yes, the done method is defined as done_callback() in the scenario file, e.g. navigation.py.
The action space is discrete in our MPE environment. The code will automatically detect a discrete action space and set the neural network architecture to accommodate discrete action-space outputs.
If it is a discrete action space, is it mapped to continuous actions later for processing, so that it can be used with MADDPG or MAPPO? Do those algorithms require a continuous action space in that way? Is my understanding correct? Can you explain it in detail and give an example?
Both MADDPG and MAPPO can use discrete actions, so there is no need to map discrete actions to continuous ones. Refer to ActLayer here to see how the network architecture is modified to accommodate different action spaces.
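Roughly, the output head is chosen based on the action-space type. A simplified sketch of that pattern follows; it mirrors the general idea, not the exact ActLayer code:

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical, Normal

# Simplified sketch of an action-output layer that adapts to the action-space type.
class SimpleActLayer(nn.Module):
    def __init__(self, hidden_dim, action_space):
        super().__init__()
        self.discrete = action_space.__class__.__name__ == "Discrete"
        if self.discrete:
            # Discrete: one logit per action, sampled from a Categorical distribution.
            self.logits = nn.Linear(hidden_dim, action_space.n)
        else:
            # Continuous (Box): a mean per action dimension plus a learned log-std.
            act_dim = action_space.shape[0]
            self.mu = nn.Linear(hidden_dim, act_dim)
            self.log_std = nn.Parameter(torch.zeros(act_dim))

    def forward(self, features):
        if self.discrete:
            dist = Categorical(logits=self.logits(features))
        else:
            dist = Normal(self.mu(features), self.log_std.exp())
        action = dist.sample()
        return action, dist.log_prob(action)
```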
I understand. The current action space is discrete, and the distance of each move is subject to dynamic constraints and maximum speed limits. Is this correct?
Yes, the distance travelled depends on the time step dt and the acceleration force $\pm1$ applied.
OK, then are there any constraints in the scene? Can you give some examples? Also, is there any handling of walls, to prevent the agent from hitting them?
There are no constraints in the scene. The only way to make the agents avoid collisions is through the reward function where the agents get a penalty for colliding with another entity. Refer here.
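For reference, a minimal sketch of a reward callback with a collision penalty; the goal-distance term, the attribute names, and the penalty value are assumptions for illustration only:

```python
import numpy as np

# Hypothetical reward callback: negative distance to goal plus a collision penalty.
def reward_callback(self, agent, world):
    rew = -np.linalg.norm(agent.state.p_pos - agent.goal.state.p_pos)
    for other in world.agents:
        if other is agent:
            continue
        # Penalise overlap between the two agents' collision radii.
        dist = np.linalg.norm(agent.state.p_pos - other.state.p_pos)
        if dist < agent.size + other.size:
            rew -= 1.0  # collision penalty (value is illustrative)
    return rew
```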
I would like to ask whether a PID dynamics model would be supported. If I want to switch to a PID dynamics model, could you briefly explain what needs to be modified?
If I modify the layout and map size in the scene file, and add static obstacles, dynamic obstacles, and some related content, do I need to modify env.py?
You can modify the layout, map size, obstacles, etc. in the scene file itself (e.g., navigation_graph.py).
If you want to add controllable agents, you can add scripted_agents in the scenario, but you will have to control them internally, based on the PID state, every time env.step() is called.
First question: if I set the map size in the scene file, do I need to modify the world size in the core file to match it? If I add dynamic obstacles, do I need to add movement logic in the env, or do I just need to set movable to True in the scene file?
Or do I only need to modify the scene file, without touching the core and env files?
Second question: what does a controllable agent mean? If I want to modify the model, does this affect it? I don't quite understand the second answer; can you give an example?
Third question: if the entity size is modified, do I need to add boundary detection, and should that go in the world or the scene file?
An example pulled from here:

```python
def step(self, action_n: List) -> Tuple[List, List, List, List]:
    self.current_step += 1
    obs_n = []
    reward_n = []
    done_n = []
    info_n = []
    self.world.current_time_step += 1
    self.agents = self.world.policy_agents
    # set action for each agent
    for i, agent in enumerate(self.agents):
        self._set_action(action_n[i], agent, self.action_space[i])
    # advance world state
    self.world.step()
    # record observation for each agent
    for agent in self.agents:
        obs_n.append(self._get_obs(agent))
        reward = self._get_reward(agent)
        reward_n.append(reward)
        done_n.append(self._get_done(agent))
        info = {"individual_reward": reward}
        env_info = self._get_info(agent)
        info.update(env_info)  # nothing fancy here, just appending dict to dict
        info_n.append(info)
    # control all scripted agents with PID
    for i, agent in enumerate(self.scripted_agents):
        action_scripted = getPIDAction(agent)  # getPIDAction is a placeholder you would implement
        self._set_action(action_scripted, agent, self.action_space[i])
    # all agents get total reward in cooperative case
    reward = np.sum(reward_n)
    if self.shared_reward:
        reward_n = [reward] * self.n
    return obs_n, reward_n, done_n, info_n
```
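getPIDAction above is only a placeholder; a minimal sketch of what such a helper might look like (the gains, the target_pos attribute, and the PD-only form are assumptions):

```python
import numpy as np

# Hypothetical controller for a scripted agent (PD form; the integral term is omitted for brevity).
def getPIDAction(agent, kp=2.0, kd=0.5):
    # Error between the scripted agent's target position and its current position.
    error = agent.target_pos - agent.state.p_pos
    # Derivative term approximated with the current velocity.
    force = kp * error - kd * agent.state.p_vel
    return force  # interpreted as the force to apply this step
```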
The boundary check is not implemented. There is nothing stopping the agent from moving outside the environment boundaries.
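If you do want to keep agents inside the map, one option (an assumption, not something the repo provides) is to clamp positions after the world update, for example:

```python
import numpy as np

# Hypothetical post-step clamp keeping every agent inside a [-1, 1] x [-1, 1] world.
def clamp_to_bounds(world, bound=1.0):
    for agent in world.agents:
        outside = np.abs(agent.state.p_pos) > bound
        # Zero the velocity components that pushed the agent out, then clip the position.
        agent.state.p_vel[outside] = 0.0
        agent.state.p_pos = np.clip(agent.state.p_pos, -bound, bound)

# Call this right after self.world.step() inside the environment's step() (assumption).
```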
OK, I think your answer is very detailed. In addition, could you explain the general role of a controllable agent? More specifically, what can I do with such an agent? If possible, please give some relevant examples at the end.
You can control these scripted agents using a different algorithm. A few examples would be agents that just follow straight lines, or just move in circles, or do reciprocal velocity obstacle (RVO) avoidance, etc.
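For instance, a minimal sketch of a scripted policy that keeps an agent circling the origin (purely illustrative; attribute names and gains are assumptions):

```python
import numpy as np

# Hypothetical scripted policy: apply a force that keeps the agent circling the origin.
def circle_action(agent, radius=0.5, gain=1.0):
    pos = agent.state.p_pos
    # Tangential direction (perpendicular to the position vector) drives the circular motion.
    tangent = np.array([-pos[1], pos[0]])
    tangent /= (np.linalg.norm(tangent) + 1e-8)
    # Radial correction pulls the agent back toward the desired radius.
    radial = -pos * (np.linalg.norm(pos) - radius)
    return gain * (tangent + radial)
```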
Closing due to inactivity. Please reopen if the issue still persists.
I would like to ask where I should set the boundary check so that the agent does not move outside the set boundaries.
What is the size unit of the map in the current scene design? What is the distance that the agent moves each time, and what is its unit? Is the current map continuous or grid-based?