Farama-Foundation / HighwayEnv

A minimalist environment for decision-making in autonomous driving
https://highway-env.farama.org/
MIT License

multi-agent setting for the Parking environment #593

Closed Rza-A closed 6 months ago

Rza-A commented 7 months ago

Hello, and thank you for providing such a simulator with diverse settings!

I'm trying to run a multi-agent setting for the Parking environment. However, when I run the code I get the following error.

 File "/home/red/miniconda3/envs/highway/lib/python3.8/site-packages/highway_env/envs/parking_env.py", line 117, in _info
    success = tuple(self._is_success(agent_obs['achieved_goal'], agent_obs['desired_goal']) for agent_obs in obs)
  File "/home/red/miniconda3/envs/highway/lib/python3.8/site-packages/highway_env/envs/parking_env.py", line 117, in <genexpr>
    success = tuple(self._is_success(agent_obs['achieved_goal'], agent_obs['desired_goal']) for agent_obs in obs)
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

It seems the error occurs on this line because obs and agent_obs are NumPy arrays, which cannot be indexed with string keys like 'achieved_goal'.
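For illustration (not part of the original report), the same IndexError can be reproduced by indexing a plain NumPy array with a string key, which is what happens when each agent_obs is a Kinematics array rather than a KinematicsGoal dict:

import numpy as np

agent_obs = np.zeros((5, 6))   # a Kinematics observation is a plain ndarray
agent_obs['achieved_goal']     # IndexError: only integers, slices (`:`), ...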

Here is my code.

import gymnasium as gym
import highway_env
import time

env = gym.make("parking-v0")

env.configure({
  "controlled_vehicles": 3,
  "vehicles_count": 1,
  'lanes_count': 1,
  "observation": {
    "type": "MultiAgentObservation",
    "observation_config": {
      "type": "Kinematics",
    }
  },
  "action": {
    "type": "MultiAgentAction",
    "action_config": {
      "type": "DiscreteMetaAction",
    }
  }
})
env.reset()
print(env.config)
done = False
env.render(mode = "human")

while not done:
  next_obs, reward, done, info = env.step(env.action_space.sample())
  print(reward, done, info)
  env.render(mode="human")
  time.sleep(1)

I'd appreciate any help on how to resolve this issue.

eleurent commented 6 months ago

Hi, this is because ParkingEnv expects a KinematicsGoal observation rather than a Kinematics observation (this was required by StableBaselines3 to support HER; see the first warning in the docs). This is how the environment is configured by default, so if you want to replace it with a MultiAgentObservation, you need to keep the wrapped "observation_config" of type KinematicsGoal.

Here is an example that should work:

import gymnasium as gym
import highway_env
import time

env = gym.make("parking-v0", render_mode="human")

env.configure({
  "controlled_vehicles": 3,
  "vehicles_count": 1,
  'lanes_count': 1,
  "observation": {
    "type": "MultiAgentObservation",
    "observation_config": {
      "type": "KinematicsGoal",
      "features": ["x", "y", "vx", "vy", "cos_h", "sin_h"],
      "scales": [100, 100, 5, 5, 1, 1],
      "normalize": False,
    }
  },
  "action": {
    "type": "MultiAgentAction",
    "action_config": {
      "type": "DiscreteMetaAction",
    }
  }
})
env.reset()
print(env.config)
done = truncated = False
env.render()

while not (done or truncated):
  next_obs, reward, done, truncated, info = env.step(env.action_space.sample())
  print(reward, done, info)
  env.render()
  time.sleep(1)
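With this configuration, the observation is a tuple with one entry per controlled vehicle, and each entry is a dict with "observation", "achieved_goal" and "desired_goal" keys, which is what _is_success expects. A quick sanity check, as a sketch:

obs, info = env.reset()
print(type(obs), len(obs))  # tuple, 3 (one observation per controlled vehicle)
print(obs[0].keys())        # 'observation', 'achieved_goal', 'desired_goal'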

Rza-A commented 6 months ago

Thank you.

It seems there is a problem with the multi-agent setting: one or more of the vehicles are spawned/rendered outside the parking area (please see the attached image). Also, the controlled vehicles all try to park in the same spot. Is this the correct behavior?

[Attached image: Screenshot from 2024-04-22 13-49-45]

Best regards, Reza

eleurent commented 6 months ago

Hi, I added a small fix that improves the starting location of vehicles.

It's true that, as of now, the multi-agent setting is a bit strange: it is a mix of adversarial (all agents try to park in the same spot) and cooperative (when maximising the total reward, i.e. the sum of agent rewards, the agents cooperate so that one vehicle is parked and the others are as close as possible).

There should probably be one goal landmark per vehicle.
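A possible direction, as a rough sketch only: a hypothetical MultiGoalParkingEnv subclass that places one Landmark per controlled vehicle. This assumes the internal _create_vehicles method and the Landmark class from highway_env.vehicle.objects, whose exact locations and signatures may differ between versions; the observation and reward would also need to map each agent to its own goal.

from highway_env.envs.parking_env import ParkingEnv
from highway_env.vehicle.objects import Landmark


class MultiGoalParkingEnv(ParkingEnv):
    """Hypothetical variant with one goal landmark per controlled vehicle."""

    def _create_vehicles(self) -> None:
        super()._create_vehicles()  # base class creates the vehicles and a single shared self.goal
        # Place one landmark per controlled vehicle on distinct random lanes (sketch only).
        lanes = self.road.network.lanes_list()
        chosen = self.np_random.choice(len(lanes), size=len(self.controlled_vehicles), replace=False)
        self.goals = []
        for lane_index in chosen:
            lane = lanes[lane_index]
            goal = Landmark(self.road, lane.position(lane.length / 2, 0), heading=lane.heading)
            self.road.objects.append(goal)
            self.goals.append(goal)
        # _is_success / compute_reward would then need to compare each agent's
        # achieved_goal against its own landmark rather than the shared self.goal.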