huawei-noah / SMARTS

Scalable Multi-Agent RL Training School for Autonomous Driving

[Help Request] Agent stuck nearby goal #2092

Closed · l1xiao closed this issue 1 year ago

l1xiao commented 1 year ago

High Level Description

Hi, I used e10 to train agent0 (the ego) and encountered the following problem:

[screenshot: ego vehicle stalled near the goal]

The ego stopped (speed 0.0) and remained stopped until the environment reached the max episode steps.

I want to know whether this situation is due to incorrect goal settings or to an issue caused by training. I added an extra term to the reward function (examples/e10_drive/train/reward.py) as follows, but it failed to fix the situation:

# Inside the per-agent reward loop of examples/e10_drive/train/reward.py,
# where agent_obs refers to obs[agent_id].
if obs[agent_id]["events"]["reached_max_episode_steps"]:
    # Penalize running out the episode without reaching the goal.
    reward[agent_id] -= np.float64(10)
    print(f"{agent_id}: Reached max episode steps.")
    continue
if agent_obs["ego_vehicle_state"]:
    # Small bonus proportional to the ego's current speed.
    reward[agent_id] += np.float64(agent_obs["ego_vehicle_state"]["speed"] * 0.01)
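
A related diagnostic (a sketch only, assuming the same dict-style observations as above; the exact placement inside reward.py is up to you) is to also log the reached_goal event, to see whether the goal is ever reached at all:

# Diagnostic sketch: report when the ego actually reaches its goal, to help
# distinguish an unreachable/missing goal from a learned stalling behaviour.
if obs[agent_id]["events"]["reached_goal"]:
    print(f"{agent_id}: Reached goal.")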

Version

1.4.0

Operating System

Ubuntu 18.04

Problems

Selected Scenarios: scenarios:

Adaickalavan commented 1 year ago

Hi @l1xiao,

  1. SMARTS/examples/e10_drive is the accompanying example for the drive-towards-a-goal task, Driving SMARTS 2023.1 & 2023.2. Feel free to read the task description for more information.

  2. SMARTS/examples/e11_platoon, on the other hand, is the accompanying example for the vehicle-following task, Driving SMARTS 2023.3. Feel free to read the task description for more information.

  3. The two tasks, namely

    • Driving SMARTS 2023.1 & 2023.2
    • Driving SMARTS 2023.3

    have different objectives and different sets of scenario files. They are not to be mixed; the agents are to be trained separately for the two tasks.

  4. In Driving SMARTS 2023.1 & 2023.2, the agent is tasked to drive towards a goal position. Here, the episode ends when the agent reaches the goal, when it crashes, or when it reaches the max episode steps. In Driving SMARTS 2023.3, however, the agent is tasked to follow a lead vehicle, and the episode ends when the lead vehicle exits the map or when the agent crashes. Hence, there is no goal position in the task Driving SMARTS 2023.3.

  5. Suitable training scenarios for the task Driving SMARTS 2023.1 & 2023.2 are given here, whereas suitable training scenarios for the task Driving SMARTS 2023.3 are given here.

  6. We see that you have mixed and matched the scenario files in SMARTS/examples/e10_drive/train/config.yaml, as it erroneously includes the vehicle-following scenarios, namely

    • scenarios/sumo/vehicle_following/straight_2lane_sumo_agents_1
    • scenarios/sumo/vehicle_following/straight_3lanes_sumo_agents_1

  7. Training an agent for two different tasks with the single e10_drive example training code will lead to conflicting agent behaviour.

  8. We believe the screenshot above shows one of the vehicle-following scenarios during training. Since there is no goal position there, the agent simply stalls, refusing to exit the map in order to avoid the reward penalty.

  9. Please try training the e10_drive example using the default scenario files; a quick check for leftover vehicle-following entries is sketched below.
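
To make point 9 concrete, here is a small sanity check that could be run before training. It assumes config.yaml keeps its scenario list under a top-level scenarios: key (as the "Selected Scenarios" snippet above suggests); adjust the key if your config nests it differently:

# Sanity-check sketch: verify no vehicle-following scenarios remain in the
# drive-task training config. The top-level "scenarios" key is an assumption.
from pathlib import Path

import yaml

config = yaml.safe_load(Path("examples/e10_drive/train/config.yaml").read_text())
mixed = [s for s in config.get("scenarios", []) if "vehicle_following" in s]
assert not mixed, f"Remove vehicle-following scenarios from the drive config: {mixed}"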

l1xiao commented 1 year ago

Thanks for your reply. You are right. I read the details of scenarios/sumo/vehicle_following/straight_2lane_sumo_agents_1 and found that the ego's mission does not contain a route. In scenarios/sumo/vehicle_following/straight_3lanes_sumo_agents_1/scenario.py:

ego_missions = [
    EndlessMission(
        begin=("E0", 1, 20),
        entry_tactic=TrapEntryTactic(
            start_time=1, wait_to_hijack_limit_s=0, default_entry_speed=0
        ),
    )  # Delayed start, to ensure road has prior traffic.
]
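
For contrast, the drive-task scenarios give the ego a mission with an explicit route, so a goal position exists. Below is a minimal sketch of such a mission; the edge IDs and offsets are illustrative placeholders rather than values from an actual SMARTS scenario file, and the import path may differ across SMARTS versions:

# Illustrative sketch: a drive-task ego mission with an explicit route (and
# hence a goal position), unlike the EndlessMission above. Edge IDs/offsets
# are placeholders; the import path may be smarts.sstudio.sstypes in newer
# SMARTS versions.
from smarts.sstudio.types import Mission, Route, TrapEntryTactic

ego_missions = [
    Mission(
        route=Route(begin=("E0", 1, 20), end=("E1", 0, "max")),
        entry_tactic=TrapEntryTactic(
            start_time=1, wait_to_hijack_limit_s=0, default_entry_speed=0
        ),
    )
]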