metadriverse / scenarionet

ScenarioNet: Scalable Traffic Scenario Management System for Autonomous Driving

sim problem in Google Colab and Mac #21

Closed huyuening closed 1 month ago

huyuening commented 9 months ago

Command: !python -m scenarionet.sim -d /content/exp_converted/ --render 3D

Result:

Known pipe types:
  glxGraphicsPipe
(1 aux display modules not yet loaded.)
:ShowBase(warning): Unable to open 'onscreen' window.
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/scenarionet/sim.py", line 52, in <module>
    env.reset(seed=index if args.scenario_index is None else args.scenario_index)
  File "/usr/local/lib/python3.10/dist-packages/metadrive/envs/base_env.py", line 467, in reset
    self.lazy_init()  # it only works the first time when reset() is called to avoid the error when render
  File "/usr/local/lib/python3.10/dist-packages/metadrive/envs/base_env.py", line 353, in lazy_init
    initialize_engine(self.config)
  File "/usr/local/lib/python3.10/dist-packages/metadrive/engine/engine_utils.py", line 12, in initialize_engine
    cls.singleton = cls(env_global_config)
  File "/usr/local/lib/python3.10/dist-packages/metadrive/engine/base_engine.py", line 36, in __init__
    EngineCore.__init__(self, global_config)
  File "/usr/local/lib/python3.10/dist-packages/metadrive/engine/core/engine_core.py", line 171, in __init__
    super(EngineCore, self).__init__(windowType=self.mode)
  File "/usr/local/lib/python3.10/dist-packages/direct/showbase/ShowBase.py", line 341, in __init__
    self.openDefaultWindow(startDirect = False, props=props)
  File "/usr/local/lib/python3.10/dist-packages/direct/showbase/ShowBase.py", line 1026, in openDefaultWindow
    self.openMainWindow(*args, **kw)
  File "/usr/local/lib/python3.10/dist-packages/direct/showbase/ShowBase.py", line 1061, in openMainWindow
    self.openWindow(*args, **kw)
  File "/usr/local/lib/python3.10/dist-packages/direct/showbase/ShowBase.py", line 806, in openWindow
    raise Exception('Could not open window.')
Exception: Could not open window.

Can this run on Colab? Thanks.

QuanyiLi commented 9 months ago

The scenarionet/sim.py script requires your machine to have a screen to render the 3D scenarios, while Colab machines don't have any video output devices and thus cannot run this script. But if your notebook is running locally on a machine with a screen, you can of course launch this script.

But you can still visualize the scenarios in Colab :) A good workaround is to use the 2D pygame renderer, which doesn't need an X server/screen. Then you can save the frames to a GIF once you have finished an episode and play that GIF.

An example is at https://colab.research.google.com/github/metadriverse/metadrive/blob/main/metadrive/examples/Basic_MetaDrive_Usages.ipynb. There is a section called Real-world Scenario Environment Visualization. All you need to pay attention to is to use frame = env.render(mode="top_down", **extra_args) to save each frame, and then use

import pygame
from PIL import Image

# Convert each recorded pygame surface to a numpy array, then to a PIL image
imgs = [pygame.surfarray.array3d(frame) for frame in frames]
imgs = [Image.fromarray(img) for img in imgs]
# Write all frames into one animated GIF (50 ms per frame, looping forever)
imgs[0].save("demo.gif", save_all=True, append_images=imgs[1:], duration=50, loop=0)
print("\nOpen gif...")
from IPython.display import Image as IPImage  # alias so it does not shadow PIL.Image
IPImage(open("demo.gif", 'rb').read())

to generate the GIF.
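
For reference, here is a minimal sketch (assuming a ScenarioEnv built from the converted dataset at /content/exp_converted, as in your command above) of a rollout loop that fills the frames list used above; the exact extra_args for the top-down renderer are shown in the linked notebook, window/film_size below are just one possibility:

from metadrive.envs import ScenarioEnv

env = ScenarioEnv(dict(data_directory="/content/exp_converted", num_scenarios=1))
frames = []
obs, _ = env.reset(seed=0)
for _ in range(100):
    # placeholder action; we only want the rendered frames here
    obs, reward, terminated, truncated, info = env.step([0.0, 0.0])
    # window=False keeps rendering off-screen; the returned frame is collected for the GIF
    frames.append(env.render(mode="top_down", window=False, film_size=(1000, 1000)))
    if terminated or truncated:
        break
env.close()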

Thanks for your feedback. I should make this clearer and will add a Colab example to this repo soon.

QuanyiLi commented 9 months ago

I made a Colab example here: https://colab.research.google.com/github/metadriverse/scenarionet/blob/main/tutorial/simulation.ipynb

Feel free to give me any feedback!

huyuening commented 9 months ago

When I followed the Colab example, it worked. Thank you very much!

A new problem: in the ScenarioNet documentation, I only see the visualization. But I want to hijack a vehicle (e.g. the AV) and use an algorithm to control it. What can I do?

I think the ScenarioNet documentation doesn't show this process.

QuanyiLi commented 9 months ago

If you want to control the vehicle, just remove the config "agent_policy" from the dict. After that, the agent policy will fall back to the default ExternalInputPolicy, which uses the input of env.step() to set the throttle and steering of the ego vehicle. The input is a two-dimensional vector [steering, throttle], and the values of both dims should be in the range [-1, 1]. Thus, for example, you can use env.step([0, 1]) to make the car move forward.
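
A minimal sketch of what that looks like (assuming a converted dataset at /content/exp_converted and that the env config no longer sets agent_policy):

from metadrive.envs import ScenarioEnv

env = ScenarioEnv(dict(data_directory="/content/exp_converted", num_scenarios=1))
obs, _ = env.reset(seed=0)
for _ in range(100):
    # action = [steering, throttle], both in [-1, 1]; zero steering + full throttle drives straight ahead
    obs, reward, terminated, truncated, info = env.step([0.0, 1.0])
    if terminated or truncated:
        break
env.close()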

huyuening commented 9 months ago

This looks like simple control, just throttle and steering, without perception and decision-making. I hope to use a trained autonomous driving algorithm. In your examples, can ScenarioNet with ROS or OpenPilot achieve this goal?

QuanyiLi commented 9 months ago

Well, it depends on how you build your autonomous driving system (ADS). Basically, an ADS is a mapping or function from image/lidar/IMU to throttle/steering. env.step() will return an observation which contains the image/lidar/IMU data as the input of the ADS. Then your ADS should produce [steering, throttle], which will be fed into the next env.step().

The pseudo-code is like:

my_ADS = ADS()
obs, _ = env.reset()
for _ in range(max_episode_len):
    action = my_ADS.compute_action(obs)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        break

Therefore, the decision-making should happen in my_ADS.compute_action(obs). You can make it as complex as openpilot or as simple as an end-to-end RL policy. But even the complex openpilot controller still follows the decision-making procedure above: take images as input and output throttle/steering.

huyuening commented 9 months ago

Thanks for the answer. Is it possible to provide a simple end-to-end RL policy example in the documentation for easier understanding?

QuanyiLi commented 9 months ago

I cannot document too many details on training/designing at this time. Sorry about that.

But we do include an end-to-end driving policy in the simulator. The source code is at https://github.com/metadriverse/metadrive/blob/main/metadrive/examples/ppo_expert/numpy_expert.py The policy is a 3-layer MLP trained with a huge amount of data. It takes 240 pseudo-lidar points, IMU, and navigation info as input and outputs throttle and steering.

To experience this policy, just run python -m metadrive.examples.drive_in_single_agent_env. The autopilot mode means the car is controlled by the end-to-end policy.

huyuening commented 9 months ago

I get it.

huyuening commented 5 months ago

In addition to Google Colab, my MacBook Air (Apple M1) has a similar problem. What should I do? Thanks.

1. python -m scenarionet.sim -d /path/to/exp_converted --render 3D

:ShowBase(warning): Unable to open 'onscreen' window.
Traceback (most recent call last):
  File "/Users/huyuening/opt/anaconda3/envs/scenarionet/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/huyuening/opt/anaconda3/envs/scenarionet/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/huyuening/mdsn/scenarionet/scenarionet/sim.py", line 52, in <module>
    env.reset(seed=index if args.scenario_index is None else args.scenario_index)
  File "/Users/huyuening/mdsn/metadrive/metadrive/envs/base_env.py", line 557, in reset
    self.lazy_init()  # it only works the first time when reset() is called to avoid the error when render
  File "/Users/huyuening/mdsn/metadrive/metadrive/envs/base_env.py", line 433, in lazy_init
    initialize_engine(self.config)
  File "/Users/huyuening/mdsn/metadrive/metadrive/engine/engine_utils.py", line 12, in initialize_engine
    cls.singleton = cls(env_global_config)
  File "/Users/huyuening/mdsn/metadrive/metadrive/engine/base_engine.py", line 55, in __init__
    EngineCore.__init__(self, global_config)
  File "/Users/huyuening/mdsn/metadrive/metadrive/engine/core/engine_core.py", line 183, in __init__
    super(EngineCore, self).__init__(windowType=self.mode)
  File "/Users/huyuening/opt/anaconda3/envs/scenarionet/lib/python3.9/site-packages/direct/showbase/ShowBase.py", line 341, in __init__
    self.openDefaultWindow(startDirect = False, props=props)
  File "/Users/huyuening/opt/anaconda3/envs/scenarionet/lib/python3.9/site-packages/direct/showbase/ShowBase.py", line 1026, in openDefaultWindow
    self.openMainWindow(*args, **kw)
  File "/Users/huyuening/opt/anaconda3/envs/scenarionet/lib/python3.9/site-packages/direct/showbase/ShowBase.py", line 1061, in openMainWindow
    self.openWindow(*args, **kw)
  File "/Users/huyuening/opt/anaconda3/envs/scenarionet/lib/python3.9/site-packages/direct/showbase/ShowBase.py", line 806, in openWindow
    raise Exception('Could not open window.')
Exception: Could not open window.

2. python -m scenarionet.sim -d /path/to/exp_converted --render advanced

[!!!] RenderPipeline Sorry, your GPU does not support compute shaders! Make sure you have the latest drivers. If you already have, your gpu might be too old, or you might be using the open source drivers on linux.

QuanyiLi commented 5 months ago

Hi Yuening,

Sorry about that. It is actually a known issue that Macs with M-series chips cannot launch the 3D rendering service. A workaround is to use the top-down renderer instead.
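
A minimal sketch of that workaround (assuming a converted dataset at /path/to/exp_converted, as in your command): render headlessly with the top-down renderer and export a GIF, using the same render arguments as in your evaluation script below.

from metadrive.envs import ScenarioEnv

env = ScenarioEnv(dict(data_directory="/path/to/exp_converted", num_scenarios=1))
obs, _ = env.reset(seed=0)
for _ in range(300):
    obs, reward, terminated, truncated, info = env.step([0.0, 0.0])  # placeholder action
    # window=False keeps it headless; screen_record=True stores frames for the GIF below
    env.render(mode="topdown", screen_record=True, window=False, film_size=(1200, 1200))
    if terminated or truncated:
        break
env.top_down_renderer.generate_gif("scenario_0.gif")
env.close()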

Quanyi

huyuening commented 5 months ago

I cannot document too many details on training/designing at this time. Sorry about that.

But we do include an end-to-end driving policy in the simulator. The source code is at https://github.com/metadriverse/metadrive/blob/main/metadrive/examples/ppo_expert/numpy_expert.py The policy is a 3-layer MLP trained with a huge amount of data. It takes 240 pseudo-lidar points, IMU, and navigation info as input and outputs throttle and steering.

To experience this policy, just run python -m metadrive.examples.drive_in_single_agent_env. The autopilot mode means the car is controlled by the end-to-end policy.

Currently, ppo_expert is used as an example only in drive_in_single_agent_env in MetaDrive. However, my goal is to apply this policy in the converted Waymo scenarios (from ScenarioNet) and control the ego car (self-driving car) in each scenario. Due to my limited capacity, I cannot accomplish this by myself. Can you give me some help?

QuanyiLi commented 5 months ago

The ScenarioEnv is compatible with any reinforcement learning framework. I recommend setting the number of scenarios to 1 and using algorithms from stable-baselines3 to train your first policy in a single Waymo scene.
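
A minimal sketch of that suggestion (the dataset path and hyperparameters below are placeholders):

from metadrive.envs import ScenarioEnv
from stable_baselines3 import PPO

# A single Waymo scenario from the converted dataset; the policy should be able to overfit it
env = ScenarioEnv(dict(data_directory="/content/exp_converted", num_scenarios=1))
model = PPO("MlpPolicy", env, n_steps=4096, verbose=1)
model.learn(total_timesteps=1_000_000)
model.save("ppo_single_waymo_scenario")
env.close()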

QuanyiLi commented 5 months ago

If you are familiar with Ray, you can build your training script based on this:

https://github.com/metadriverse/scenarionet/blob/main/scenarionet_training/scripts/train_waymo.py

huyuening commented 4 months ago

I tried the training demo based on Stable-Baselines3 from the MetaDrive documentation (https://metadrive-simulator.readthedocs.io/en/latest/training.html), and I ran into some problems.

import gymnasium as gym
import matplotlib.pyplot as plt
import os

from functools import partial
from IPython.display import clear_output
from IPython.display import Image
from metadrive.envs import MetaDriveEnv
from metadrive.envs import ScenarioEnv
from metadrive.utils import generate_gif
from stable_baselines3 import PPO
from stable_baselines3.common.monitor import Monitor
from stable_baselines3.common.vec_env.subproc_vec_env import SubprocVecEnv

num_scenarios = 50000

def waymo_env(need_monitor=False):
    env = ScenarioEnv(
        dict(
            # manual_control=False,
            # reactive_traffic=False,
            # use_render=False,
            data_directory="/content/drive/MyDrive/exp_converted",
            num_scenarios=num_scenarios
        )
    )
    if need_monitor:
        env = Monitor(env)
    return env

# 8 subprocess to rollout
train_env=SubprocVecEnv([partial(waymo_env, True) for _ in range(8)])
# train_env=waymo_env()

model = PPO("MlpPolicy",
            train_env,
            n_steps=4096,
            verbose=1)
model.learn(total_timesteps=25_000 if os.getenv('TEST_DOC') else 300_000,
            log_interval=4)
# model.learn(total_timesteps=300_000, progress_bar=True)
train_env.close()
model.save("/content/drive/MyDrive/Autonomous_Driving_Algorithm_Waymo")

clear_output()
print("Training is finished! Generate gif ...")

# model.load("/content/drive/MyDrive/Autonomous_Driving_Algorithm_Waymo")

# evaluation
for seed in range(num_scenarios):
    try:
        total_reward = 0
        env=waymo_env()
        obs, _ = env.reset(seed=seed)
        for i in range(1000):
            action, _states = model.predict(obs, deterministic=True)
            obs, reward, done, _, info = env.step(action)
            total_reward += reward
            ret = env.render(mode="topdown",
                            screen_record=True,
                            window=False,
                            film_size=(1200, 1200)
                            # screen_size=(600, 600),
                            # camera_position=(50, 50)
                            )
            if done:
                print("episode_reward", total_reward)
                break

        env.top_down_renderer.generate_gif("scenario_{}.gif".format(seed))
    finally:
        env.close()
print("gif generation is finished ...")

1. When I selected `train_env=SubprocVecEnv([partial(waymo_env, True) for _ in range(8)])`
    
    EOFError                                  Traceback (most recent call last)
    <ipython-input-16-f442555c378c> in <cell line: 39>()
     37             n_steps=4096,
     38             verbose=1)
    ---> 39 model.learn(total_timesteps=25_000 if os.getenv('TEST_DOC') else 300_000,
     40             log_interval=4)
     41 # model.learn(total_timesteps=300_000, progress_bar=True)

8 frames

/usr/lib/python3.10/multiprocessing/connection.py in _recv(self, size, read)
    381 if n == 0:
    382     if remaining == size:
--> 383         raise EOFError
    384     else:
    385         raise OSError("got end of file during message")

EOFError:

2. When I selected `train_env=waymo_env()`

ValueError                                Traceback (most recent call last)
in <cell line: 39>()
     37             n_steps=4096,
     38             verbose=1)
---> 39 model.learn(total_timesteps=25_000 if os.getenv('TEST_DOC') else 300_000,
     40             log_interval=4)
     41 # model.learn(total_timesteps=300_000, progress_bar=True)

17 frames

/usr/local/lib/python3.10/dist-packages/metadrive/utils/vertex.py in is_anticlockwise(points)
     64 n = len(points)
     65 for i in range(n):
---> 66     x1, y1 = points[i]
     67     x2, y2 = points[(i + 1) % n]  # The next point, wrapping around to the first
     68     sum += (x2 - x1) * (y2 + y1)

ValueError: too many values to unpack (expected 2)


3. Did you use this training demo on the converted Waymo Open Motion Dataset? When I used a small number of scenarios as the training set, it worked, but the training/evaluation result is relatively poor. The target vehicle couldn't follow the reference trajectory. Could you give me some suggestions?

QuanyiLi commented 4 months ago

I have no idea about question 1. Question 2 seems to be a problem in MetaDrive; please set show_crosswalk and show_sidewalk to False to see if that fixes it. For the last problem, you have to increase the number of samples; 300_000 is not enough. Generally, 1 million is the minimum requirement. If you use PPO, the total number of steps should be increased to 10 million.
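
For reference, a minimal sketch of where those two flags would presumably go (as env config keys, reusing the data directory from your script above):

from metadrive.envs import ScenarioEnv

env = ScenarioEnv(
    dict(
        data_directory="/content/drive/MyDrive/exp_converted",
        num_scenarios=1,
        show_crosswalk=False,  # the two flags suggested above
        show_sidewalk=False,
    )
)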

huyuening commented 4 months ago

Yes, I set show_crosswalk=False and show_sidewalk=False, and question 2 is solved.

I chose these training parameters:

num_scenarios = 70,000
total_timesteps = 10,000,000

PPO_training

The figure shows the training process. Can I consider that the training has achieved good results after 4 million timesteps? What is the meaning of some key indicators in the log; for example, why can ep_len_mean and ep_rew_mean reach quite large values?

QuanyiLi commented 4 months ago

Yeah, the reward is pretty high. You can visualize the scenario to see if it works well.

QuanyiLi commented 4 months ago

By the way, the bug in the second problem should be fixed already. Could you pull the latest MetaDrive and enable show_sidewalk and show_crosswalk to see if it still happens?

huyuening commented 4 months ago

I think the default PPO algorithm design is not suitable for the Waymo dataset.

  1. The trained scenario sometimes doesn't show up completely.

scenario_0 (7)

  2. The Waymo scenarios last 20 seconds each, but the trained episodes exceed 20 seconds, which may cause the reward to increase all the time.

scenario_1 (8)

  3. ......

In your paper, the algorithm is only applied to the nuPlan and PG datasets. So, can you adjust the reward function and termination conditions to fit the Waymo dataset?

huyuening commented 4 months ago

By the way, the bug in the second problem should be fixed already. Could you pull the latest MetaDrive and enable show_sidewalk and show_crosswalk to see if it still happens?

The bug seems to be solved.

QuanyiLi commented 4 months ago

For the first problem, there is a key map_region_size in env_config; assigning it a larger value such as 1024 may address this issue. Also, if you are using the top-down renderer, the clipping caused by film_size may lead to this as well, so a larger film size may help. Please refer to https://metadrive-simulator.readthedocs.io/en/latest/top_down_render.html for more details.

For the second problem, you can set horizon=300 or so to terminate the episode, so the environment steps and reward won't increase forever.

For the third problem, I believe the reward function and termination condition can be generalized to the Waymo dataset. The reasons you cannot get a good result could be:

  1. The traffic cannot react to the ego car, which results in unreasonable collisions. Turn on reactive_traffic so the surrounding traffic reacts to the ego car.
  2. The algorithm parameters may not be appropriate. Please refer to the settings here: https://github.com/metadriverse/scenarionet/blob/main/scenarionet_training/scripts/train_waymo.py (a combined config sketch follows below).
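
A minimal sketch (hypothetical values; the keys are taken from the suggestions above) of how these settings could be combined in the env config:

from metadrive.envs import ScenarioEnv

env = ScenarioEnv(
    dict(
        data_directory="/content/drive/MyDrive/exp_converted",
        num_scenarios=1,
        map_region_size=1024,   # enlarge the loaded map region so the whole scenario shows up
        horizon=300,            # terminate the episode after 300 steps
        reactive_traffic=True,  # let surrounding traffic react to the ego car
        # film_size is a render argument instead: env.render(mode="topdown", film_size=(2000, 2000), ...)
    )
)
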
huyuening commented 4 months ago

Thanks. I'll try later.

I also tried train_waymo.py for convenience, but I ran into problems with insufficient memory, so how could I reduce memory usage?

2024-02-14 03:26:27,873 ERROR trial_runner.py:567 -- Trial MultiWorkerPPO_GymEnvWrapper_306d0_00001: Error processing event.
Traceback (most recent call last):
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 515, in _process_trial
    result = self.trial_executor.fetch_result(trial)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/tune/ray_trial_executor.py", line 488, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/worker.py", line 1428, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError: ray::MultiWorkerPPO.train() (pid=12716, ip=172.28.0.12)
  File "python/ray/_raylet.pyx", line 484, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 438, in ray._raylet.execute_task.function_executor
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 516, in train
    raise e
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 505, in train
    result = Trainable.train(self)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/tune/trainable.py", line 336, in train
    result = self.step()
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 134, in step
    res = next(self.train_exec_impl)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 756, in __next__
    return next(self.built_iterator)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 843, in apply_filter
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 843, in apply_filter
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  [Previous line repeated 1 more time]
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 876, in apply_flatten
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 828, in add_wait_hooks
    item = next(it)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  [Previous line repeated 1 more time]
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 471, in base_iterator
    yield ray.get(futures, timeout=timeout)
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.

QuanyiLi commented 4 months ago
  1. The memory for each MetaDrive instance depends on the number of scenarios, so reducing the number of scenarios in the training and evaluation environments will lead to less memory usage.
  2. As we are using PPO and Ray for parallel data sampling and evaluation, each trial will launch num_workers + evaluation_num_workers MetaDrive instances. Using fewer workers reduces memory as well.
  3. That script launches 5 experiments concurrently, which means there will be in total 5 * (num_workers + evaluation_num_workers) MetaDrive instances on your system. The last option is to run fewer experiments, for example, just 1 (see the sketch below).
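
For illustration, a rough sketch of the knobs behind these three points (the key names follow the usual RLlib config; how exactly they are passed to train_waymo.py may differ, so treat this as an assumption):

# Hypothetical overrides for a train_waymo.py-style run to shrink the memory footprint.
memory_saving_overrides = dict(
    num_workers=2,                         # fewer sampling workers (point 2)
    evaluation_num_workers=2,              # fewer evaluation workers (point 2)
    env_config=dict(num_scenarios=1000),   # fewer scenarios per MetaDrive instance (point 1)
)
# Point 3: additionally run a single experiment/seed instead of 5 concurrent trials.
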
huyuening commented 4 months ago

I have some problems with the PPO algorithm training:

  1. It seems strange that no result appears in the first 21500 s, and then a result is output every ~250 s;
  2. An error is reported during the training process: ValueError: Summary file is not found at /content/drive/MyDrive/mdsn/scenarionet/dataset/waymo_test/dataset_summary.pkl!
    
    /content/drive/MyDrive/mdsn/scenarionet
    WARNING:tensorflow:From /usr/local/envs/scenarionet/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
    Instructions for updating:
    non-resource variables are not supported in the long term
    Successfully initialize Ray!
    Available resources:  {'memory': 552.0, 'CPU': 8.0, 'object_store_memory': 190.0, 'node:172.28.0.12': 1.0}
    We are using this wandb key file:  /content/drive/MyDrive/mdsn/scenarionet/scenarionet_training/wandb_utils/wandb_api_key_file.txt
    == Status ==
    Memory usage on this node: 7.1/51.0 GiB
    Using FIFO scheduling algorithm.
    Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
    Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
    Number of trials: 1 (1 RUNNING)
    +------------------------------------------+----------+-------+--------+
    | Trial name                               | status   | loc   |   seed |
    |------------------------------------------+----------+-------+--------|
    | MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  |       |      0 |
    +------------------------------------------+----------+-------+--------+

wandb: Currently logged in as: deep-learning-for-av (use wandb login --relogin to force relogin) wandb: wandb version 0.16.3 is available! To upgrade, please run: wandb: $ pip install wandb --upgrade wandb: Tracking run with wandb version 0.12.1 wandb: Syncing run TEST_aa36c_00000 wandb: View project at https://wandb.ai/deep-learning-for-av/scenarionet wandb: View run at https://wandb.ai/deep-learning-for-av/scenarionet/runs/aa36c_00000 wandb: Run data is saved locally in /content/drive/MyDrive/mdsn/scenarionet/wandb/run-20240215_055421-aa36c_00000 wandb: Run wandb offline to turn off syncing.

== Status == (trial MultiWorkerPPO_GymEnvWrapper_aa36c_00000, seed 0, loc 172.28.0.12:17417, 1 RUNNING)
Using FIFO scheduling algorithm. Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST

| iter | total time (s) | ts     | reward     | success  | coverage  | out      | max_step  | length  | level   | node mem (GiB) |
|------|----------------|--------|------------|----------|-----------|----------|-----------|---------|---------|----------------|
| 1    | 20426.5        | 52000  | 4.78439    | 0.980995 | 0.0533757 | 0.399108 | 0.0150164 | 11.7529 | 6.15673 | 44.1/51.0      |
| 2    | 20694.8        | 104000 | -3.02698   | 0.45509  | 0.108865  | 0.137725 | 0.407186  | 303.275 | 13      | 44.2/51.0      |
| 3    | 21006.7        | 156000 | 0.437018   | 0.540404 | 0.113145  | 0.10101  | 0.358586  | 260.495 | 13      | 44.5/51.0      |
| 4    | 21267.8        | 208000 | -0.167582  | 0.471204 | 0.115123  | 0.157068 | 0.371728  | 281.602 | 13      | 44.6/51.0      |
| 5    | 21518.1        | 260000 | -1.1381    | 0.494845 | 0.115148  | 0.164948 | 0.340206  | 266.696 | 13      | 44.8/51.0      |
| 6    | 21793          | 312000 | -2.14413   | 0.494624 | 0.115176  | 0.129032 | 0.376344  | 281.011 | 13      | 44.8/51.0      |
| 7    | 22076.6        | 364000 | 0.174395   | 0.494792 | 0.115156  | 0.161458 | 0.34375   | 268.167 | 13      | 44.9/51.0      |
| 8    | 22358.1        | 416000 | -2.15311   | 0.469274 | 0.11516   | 0.145251 | 0.385475  | 291.536 | 13      | 44.9/51.0      |
| 9    | 22664          | 468000 | -0.265232  | 0.535211 | 0.115132  | 0.169014 | 0.295775  | 243.624 | 13      | 45.0/51.0      |
| 10   | 22937.6        | 520000 | -0.0915441 | 0.450867 | 0.115205  | 0.17341  | 0.375723  | 304.231 | 13      | 45.0/51.0      |
| 11   | 23208.4        | 572000 | -2.13014   | 0.539604 | 0.115127  | 0.163366 | 0.29703   | 249.723 | 13      | 45.1/51.0      |
| 12   | 23482.6        | 624000 | -0.258691  | 0.455056 | 0.115216  | 0.174157 | 0.370787  | 298.253 | 13      | 45.1/51.0      |
| 13   | 23745          | 676000 | -0.545498  | 0.527919 | 0.115121  | 0.147208 | 0.324873  | 257.33  | 13      | 45.2/51.0      |
| 14   | 24023.6        | 728000 | -1.42022   | 0.48913  | 0.1152    | 0.125    | 0.38587   | 292.761 | 13      | 45.2/51.0      |

2024-02-15 12:39:54,398 ERROR trial_runner.py:567 -- Trial MultiWorkerPPO_GymEnvWrapper_aa36c_00000: Error processing event.
Traceback (most recent call last):
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 515, in _process_trial
    result = self.trial_executor.fetch_result(trial)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/tune/ray_trial_executor.py", line 488, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/worker.py", line 1428, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::MultiWorkerPPO.train() (pid=17417, ip=172.28.0.12)
  File "python/ray/_raylet.pyx", line 484, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 438, in ray._raylet.execute_task.function_executor
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 531, in train
    evaluation_metrics = self._evaluate()
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 747, in _evaluate
    ray.get([
ray.exceptions.RayTaskError(ValueError): ray::RolloutWorker.sample() (pid=17730, ip=172.28.0.12)
  File "python/ray/_raylet.pyx", line 484, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 438, in ray._raylet.execute_task.function_executor
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 579, in sample
    batches = [self.input_reader.next()]
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 93, in next
    batches = [self.get_data()]
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 209, in get_data
    item = next(self.rollout_provider)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 548, in _env_runner
    base_env.poll()
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/env/base_env.py", line 325, in poll
    self.new_obs = self.vector_env.vector_reset()
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/env/vector_env.py", line 133, in vector_reset
    return [e.reset() for e in self.envs]
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/env/vector_env.py", line 133, in <listcomp>
    return [e.reset() for e in self.envs]
  File "/content/drive/MyDrive/mdsn/metadrive/metadrive/envs/gym_wrapper.py", line 107, in reset
    obs, _ = self._inner.reset(**not_none_params)
  File "/content/drive/MyDrive/mdsn/metadrive/metadrive/envs/base_env.py", line 522, in reset
    self.lazy_init()  # it only works the first time when reset() is called to avoid the error when render
  File "/content/drive/MyDrive/mdsn/metadrive/metadrive/envs/base_env.py", line 411, in lazy_init
    self.setup_engine()
  File "/content/drive/MyDrive/mdsn/metadrive/metadrive/envs/scenario_env.py", line 120, in setup_engine
    self.engine.register_manager("data_manager", ScenarioDataManager())
  File "/content/drive/MyDrive/mdsn/metadrive/metadrive/manager/scenario_data_manager.py", line 36, in __init__
    self.summary_dict, self.summary_lookup, self.mapping = read_dataset_summary(self.directory)
  File "/content/drive/MyDrive/mdsn/metadrive/metadrive/scenario/utils.py", line 379, in read_dataset_summary
    raise ValueError(f"Summary file is not found at {summary_file}!")
ValueError: Summary file is not found at /content/drive/MyDrive/mdsn/scenarionet/dataset/waymo_test/dataset_summary.pkl!

wandb: Waiting for W&B process to finish, PID 17595 wandb: Program ended successfully. wandb:
wandb: Find user logs for this run at: /content/drive/MyDrive/mdsn/scenarionet/wandb/run-20240215_055421-aa36c_00000/logs/debug.log wandb: Find internal logs for this run at: /content/drive/MyDrive/mdsn/scenarionet/wandb/run-20240215_055421-aa36c_00000/logs/debug-internal.log wandb: Run summary: wandb: episode_reward_max 9.55808 wandb: episode_reward_min -236.17933 wandb: episode_reward_mean -1.42022 wandb: episode_len_mean 292.76087 wandb: episodes_this_iter 184 wandb: num_healthy_workers 8 wandb: timesteps_total 728000 wandb: episodes_total 6716 wandb: training_iteration 14 wandb: timestamp 1708000507 wandb: time_this_iter_s 278.52195 wandb: time_total_s 24023.57016 wandb: time_since_restore 24023.57016 wandb: timesteps_since_restore 0 wandb: iterations_since_restore 14 wandb: success 0.48913 wandb: out 0.125 wandb: max_step 0.38587 wandb: level 13.0 wandb: length 292.76087 wandb: coverage 0.1152 wandb: custom_metrics/success_rate_mean 0.48913 wandb: custom_metrics/success_rate_min 0.0 wandb: custom_metrics/success_rate_max 1.0 wandb: custom_metrics/crash_rate_mean 0.0163 wandb: custom_metrics/crash_rate_min 0.0 wandb: custom_metrics/crash_rate_max 1.0 wandb: custom_metrics/out_of_road_rate_mean 0.125 wandb: custom_metrics/out_of_road_rate_min 0.0 wandb: custom_metrics/out_of_road_rate_max 1.0 wandb: custom_metrics/max_step_rate_mean 0.38587 wandb: custom_metrics/max_step_rate_min 0.0 wandb: custom_metrics/max_step_rate_max 1.0 wandb: custom_metrics/velocity_max_mean 1.45416 wandb: custom_metrics/velocity_max_min 0.00064 wandb: custom_metrics/velocity_max_max 8.61268 wandb: custom_metrics/velocity_mean_mean 0.29462 wandb: custom_metrics/velocity_mean_min 0.00064 wandb: custom_metrics/velocity_mean_max 4.31243 wandb: custom_metrics/velocity_min_mean 0.14046 wandb: custom_metrics/velocity_min_min 0.00049 wandb: custom_metrics/velocity_min_max 2.70274 wandb: custom_metrics/lateral_dist_min_mean -0.12593 wandb: custom_metrics/lateral_dist_min_min -2.00977 wandb: custom_metrics/lateral_dist_min_max 0.01272 wandb: custom_metrics/lateral_dist_max_mean 0.52085 wandb: custom_metrics/lateral_dist_max_min -0.01561 wandb: custom_metrics/lateral_dist_max_max 2.11605 wandb: custom_metrics/lateral_dist_mean_mean 0.16722 wandb: custom_metrics/lateral_dist_mean_min -1.08002 wandb: custom_metrics/lateral_dist_mean_max 1.39205 wandb: custom_metrics/steering_max_mean 0.58435 wandb: custom_metrics/steering_max_min -1.0 wandb: custom_metrics/steering_max_max 1.0 wandb: custom_metrics/steering_mean_mean -0.0227 wandb: custom_metrics/steering_mean_min -1.0 wandb: custom_metrics/steering_mean_max 1.0 wandb: custom_metrics/steering_min_mean -0.55695 wandb: custom_metrics/steering_min_min -1.0 wandb: custom_metrics/steering_min_max 1.0 wandb: custom_metrics/acceleration_min_mean -0.58033 wandb: custom_metrics/acceleration_min_min -1.0 wandb: custom_metrics/acceleration_min_max 1.0 wandb: custom_metrics/acceleration_mean_mean -0.00917 wandb: custom_metrics/acceleration_mean_min -1.0 wandb: custom_metrics/acceleration_mean_max 1.0 wandb: custom_metrics/acceleration_max_mean 0.56098 wandb: custom_metrics/acceleration_max_min -1.0 wandb: custom_metrics/acceleration_max_max 1.0 wandb: custom_metrics/step_reward_max_mean 0.1152 wandb: custom_metrics/step_reward_max_min 0.0 wandb: custom_metrics/step_reward_max_max 0.96045 wandb: custom_metrics/step_reward_mean_mean 0.00689 wandb: custom_metrics/step_reward_mean_min -0.46142 wandb: custom_metrics/step_reward_mean_max 0.96045 wandb: custom_metrics/step_reward_min_mean 
-0.23528 wandb: custom_metrics/step_reward_min_min -2.0 wandb: custom_metrics/step_reward_min_max 0.96045 wandb: custom_metrics/cost_mean 2.27174 wandb: custom_metrics/cost_min 0.0 wandb: custom_metrics/cost_max 119.0 wandb: custom_metrics/num_crash_vehicle_mean 1.52174 wandb: custom_metrics/num_crash_vehicle_min 0.0 wandb: custom_metrics/num_crash_vehicle_max 119.0 wandb: custom_metrics/num_crash_object_mean 0.0 wandb: custom_metrics/num_crash_object_min 0.0 wandb: custom_metrics/num_crash_object_max 0.0 wandb: custom_metrics/num_crash_human_mean 0.625 wandb: custom_metrics/num_crash_human_min 0.0 wandb: custom_metrics/num_crash_human_max 24.0 wandb: custom_metrics/num_on_line_mean 0.0 wandb: custom_metrics/num_on_line_min 0.0 wandb: custom_metrics/num_on_line_max 0.0 wandb: custom_metrics/step_reward_lateral_mean -0.13162 wandb: custom_metrics/step_reward_lateral_min -0.69616 wandb: custom_metrics/step_reward_lateral_max 0.0 wandb: custom_metrics/step_reward_heading_mean -0.01812 wandb: custom_metrics/step_reward_heading_min -0.18567 wandb: custom_metrics/step_reward_heading_max 0.96814 wandb: custom_metrics/step_reward_action_smooth_mean 0.0 wandb: custom_metrics/step_reward_action_smooth_min 0.0 wandb: custom_metrics/step_reward_action_smooth_max 0.0 wandb: custom_metrics/route_completion_mean 0.19836 wandb: custom_metrics/route_completion_min -0.0096 wandb: custom_metrics/route_completion_max 1.02506 wandb: custom_metrics/curriculum_level_mean 13.0 wandb: custom_metrics/curriculum_level_min 13 wandb: custom_metrics/curriculum_level_max 13 wandb: custom_metrics/scenario_index_mean 5420.34239 wandb: custom_metrics/scenario_index_min 5203 wandb: custom_metrics/scenario_index_max 5598 wandb: custom_metrics/track_length_mean 19.71836 wandb: custom_metrics/track_length_min 0.25524 wandb: custom_metrics/track_length_max 50.07087 wandb: custom_metrics/num_stored_maps_mean 50.0 wandb: custom_metrics/num_stored_maps_min 50 wandb: custom_metrics/num_stored_maps_max 50 wandb: custom_metrics/scenario_difficulty_mean 53.01857 wandb: custom_metrics/scenario_difficulty_min 39.41887 wandb: custom_metrics/scenario_difficulty_max 63.16812 wandb: custom_metrics/data_coverage_mean 0.1152 wandb: custom_metrics/data_coverage_min 0.1148 wandb: custom_metrics/data_coverage_max 0.116 wandb: custom_metrics/curriculum_success_mean 0.49522 wandb: custom_metrics/curriculum_success_min 0.38 wandb: custom_metrics/curriculum_success_max 0.56 wandb: custom_metrics/curriculum_route_completion_mean 0.18768 wandb: custom_metrics/curriculum_route_completion_min 0.15351 wandb: custom_metrics/curriculum_route_completion_max 0.25077 wandb: sampler_perf/mean_env_wait_ms 233.91001 wandb: sampler_perf/mean_raw_obs_processing_ms 8.99504 wandb: sampler_perf/mean_inference_ms 2.92219 wandb: sampler_perf/mean_action_processing_ms 0.26235 wandb: timers/sample_time_ms 234864.467 wandb: timers/sample_throughput 221.404 wandb: timers/load_time_ms 83.92 wandb: timers/load_throughput 619634.743 wandb: timers/learn_time_ms 40402.186 wandb: timers/learn_throughput 1287.059 wandb: timers/update_time_ms 6.98 wandb: info/num_steps_sampled 728000 wandb: info/num_steps_trained 728000 wandb: config/num_workers 8 wandb: config/num_envs_per_worker 1 wandb: config/rollout_fragment_length 500 wandb: config/num_gpus 0 wandb: config/train_batch_size 50000 wandb: config/gamma 0.99 wandb: config/horizon 600 wandb: config/soft_horizon False wandb: config/no_done_at_end False wandb: config/normalize_actions False wandb: config/clip_actions True wandb: 
config/lr 0.0001 wandb: config/monitor False wandb: config/ignore_worker_failures False wandb: config/log_sys_usage True wandb: config/fake_sampler False wandb: config/eager_tracing False wandb: config/no_eager_on_workers False wandb: config/explore True wandb: config/evaluation_interval 15 wandb: config/evaluation_num_episodes 1000 wandb: config/in_evaluation False wandb: config/evaluation_num_workers 8 wandb: config/sample_async False wandb: config/_use_trajectory_view_api False wandb: config/synchronize_filters True wandb: config/compress_observations False wandb: config/collect_metrics_timeout 180 wandb: config/metrics_smoothing_episodes 10 wandb: config/remote_worker_envs False wandb: config/remote_env_batch_wait_ms 0 wandb: config/min_iter_time_s 0 wandb: config/timesteps_per_iteration 0 wandb: config/seed 0 wandb: config/num_cpus_per_worker 0.3 wandb: config/num_gpus_per_worker 0 wandb: config/num_cpus_for_driver 1 wandb: config/memory 0 wandb: config/object_store_memory 0 wandb: config/memory_per_worker 0 wandb: config/object_store_memory_per_worker 0 wandb: config/postprocess_inputs False wandb: config/shuffle_buffer_size 0 wandb: config/output_max_file_size 67108864 wandb: config/replay_sequence_length 1 wandb: config/use_critic True wandb: config/use_gae True wandb: config/lambda 1.0 wandb: config/kl_coeff 0.2 wandb: config/sgd_minibatch_size 200 wandb: config/shuffle_sequences True wandb: config/num_sgd_iter 20 wandb: config/vf_share_layers False wandb: config/vf_loss_coeff 1.0 wandb: config/entropy_coeff 0.0 wandb: config/clip_param 0.3 wandb: config/vf_clip_param 10.0 wandb: config/kl_target 0.01 wandb: config/simple_optimizer False wandb: config/_fake_gpus False wandb: perf/cpu_util_percent 77.02267 wandb: perf/ram_util_percent 88.78766 wandb: config/model/free_log_std False wandb: config/model/no_final_linear False wandb: config/model/vf_share_layers True wandb: config/model/use_lstm False wandb: config/model/max_seq_len 20 wandb: config/model/lstm_cell_size 256 wandb: config/model/lstm_use_prev_action_reward False wandb: config/model/_time_major False wandb: config/model/framestack True wandb: config/model/dim 84 wandb: config/model/grayscale False wandb: config/model/zero_mean True wandb: config/env_config/start_scenario_index 0 wandb: config/env_config/num_scenarios 40000 wandb: config/env_config/sequential_seed True wandb: config/env_config/curriculum_level 100 wandb: config/env_config/target_success_rate 0.8 wandb: config/env_config/reactive_traffic True wandb: config/env_config/no_static_vehicles True wandb: config/env_config/no_light True wandb: config/env_config/static_traffic_object True wandb: config/env_config/driving_reward 1 wandb: config/env_config/steering_range_penalty 0 wandb: config/env_config/heading_penalty 1 wandb: config/env_config/lateral_penalty 1.0 wandb: config/env_config/no_negative_reward True wandb: config/env_config/on_lane_line_penalty 0 wandb: config/env_config/crash_vehicle_penalty 2 wandb: config/env_config/crash_human_penalty 2 wandb: config/env_config/out_of_road_penalty 2 wandb: config/env_config/max_lateral_dist 2 wandb: config/tf_session_args/intra_op_parallelism_threads 2 wandb: config/tf_session_args/inter_op_parallelism_threads 2 wandb: config/tf_session_args/log_device_placement False wandb: config/tf_session_args/allow_soft_placement True wandb: config/local_tf_session_args/intra_op_parallelism_threads 8 wandb: config/local_tf_session_args/inter_op_parallelism_threads 8 wandb: info/learner/default_policy/cur_kl_coeff 0.2 wandb: 
info/learner/default_policy/cur_lr 0.0001
wandb: info/learner/default_policy/total_loss 18.57894
wandb: info/learner/default_policy/policy_loss -0.03165
wandb: info/learner/default_policy/vf_loss 18.60693
wandb: info/learner/default_policy/vf_explained_var 0.702
wandb: info/learner/default_policy/kl 0.01829
wandb: info/learner/default_policy/entropy 3.11301
wandb: info/learner/default_policy/entropy_coeff 0.0
wandb: config/evaluation_config/env_config/start_scenario_index 0
wandb: config/evaluation_config/env_config/num_scenarios 1000
wandb: config/evaluation_config/env_config/sequential_seed True
wandb: config/evaluation_config/env_config/curriculum_level 1
wandb: config/tf_session_args/gpu_options/allow_growth True
wandb: config/tf_session_args/device_count/CPU 1
wandb: config/logger_config/wandb/log_config True
wandb: config/env_config/vehicle_config/side_detector/num_lasers 0
wandb: _runtime 24045
wandb: _timestamp 1708000507
wandb: _step 13
wandb: Run history: (per-metric sparkline trend plots omitted: episode reward/length, success/out/max_step rates, custom_metrics/*, sampler and timer statistics, and static config values)
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Synced TEST_aa36c_00000: https://wandb.ai/deep-learning-for-av/scenarionet/runs/aa36c_00000

== Status ==
Memory usage on this node: 2.6/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 0.0/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 ERROR)

| Trial name | status | loc | seed | iter | total time (s) | ts | reward | success | coverage | out | max_step | length | level |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | ERROR | | 0 | 14 | 24023.6 | 728000 | -1.42022 | 0.48913 | 0.1152 | 0.125 | 0.38587 | 292.761 | 13 |

Number of errored trials: 1

| Trial name | # failures | error file |
|---|---|---|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | 1 | /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST/MultiWorkerPPO_GymEnvWrapper_aa36c_00000_0_seed=0_2024-02-15_05-54-21/error.txt |


Traceback (most recent call last):
  File "scenarionet_training/scripts/train_waymo.py", line 83, in <module>
    train(
  File "/content/drive/MyDrive/mdsn/scenarionet/scenarionet_training/train_utils/utils.py", line 166, in train
    analysis = tune.run(
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/tune/tune.py", line 427, in run
    raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [MultiWorkerPPO_GymEnvWrapper_aa36c_00000])

huyuening commented 4 months ago

I solved problem 1 (I had forgotten to add a test set), but problem 2 still remains.

Training has been running for nearly a day, but the success rate, reward, and scenario length do not seem to change significantly. Is this reasonable?

[Attached screenshots of the training curves: 截屏2024-02-22 21 49 03, 截屏2024-02-22 21 50 31, 截屏2024-02-22 21 52 21, 截屏2024-02-22 21 55 21]
QuanyiLi commented 4 months ago

That's weird. It should at least show some improvement. It seems you are running PPO. Please:

  1. Try training on a single scenario with a long ego-car trajectory. Some Waymo scenarios have very short trajectories with only 5-10 m of displacement, which is too easy. I guess the scenario you are training on has a very short trajectory, so the policy can reach some success rate without moving forward. You can verify this by driving the car yourself with the keyboard controller, setting show_navi_mark=True and manual_control=True, and following the trajectory (see the config sketch after this list). You should also find that no rear-end collisions happen, as reactive_traffic should be set to True.
  2. Setting no_traffic=True removes the influence of the surrounding vehicles, which reduces the task to simple trajectory following. If training works in this setting, we may need to investigate whether something is wrong with the traffic.
  3. How many workers are you using? Your sample efficiency is really low... You can find more statistics about the sampling and evaluation time; I guess your evaluation takes too much time. As you are testing on one scenario, just set evaluation_num_episodes=1 and evaluation_num_workers=1. If sampling takes most of the time, you should increase the number of workers.
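
For points 1 and 2, something like the sketch below is what I mean. It is only a sketch under assumptions: the dataset path is a placeholder, the keys (data_directory, num_scenarios, reactive_traffic, no_traffic, manual_control, show_navi_mark) follow MetaDrive's ScenarioEnv config, use_render=True needs a machine with a display, and depending on your MetaDrive version reset()/step() may return the old Gym tuples (used here) or the newer Gymnasium forms.

```python
# Sketch: manually drive one converted scenario to check how hard it actually is.
from metadrive.envs.scenario_env import ScenarioEnv

env = ScenarioEnv(dict(
    data_directory="/path/to/exp_converted",  # placeholder: your converted dataset
    start_scenario_index=0,
    num_scenarios=1,            # inspect / overfit a single scenario (point 1)
    reactive_traffic=True,      # traffic reacts to the ego car, so no rear-end collisions are expected
    # no_traffic=True,          # uncomment to remove surrounding vehicles entirely (point 2)
    use_render=True,            # opens the 3D window; needs a screen, unlike the top-down renderer
    manual_control=True,        # drive with the keyboard and try to follow the logged trajectory
    vehicle_config=dict(show_navi_mark=True),  # draw the navigation target
))

obs = env.reset()
done = False
while not done:
    # The action is ignored while manual_control is active; stepping just advances the sim.
    obs, reward, done, info = env.step([0.0, 0.0])
env.close()
```

For point 3, evaluation_num_episodes and evaluation_num_workers are standard RLlib trainer options, so lowering them in the training config should cut the evaluation overhead while you debug on a single scenario.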
huyuening commented 3 months ago

Recently I was able to use a machine with 256 GB of RAM (running Windows). However, this presents some new problems. How could I solve them?

python train_waymo.py --num-gpus 0
F0320 22:30:23.563987 21692  7620 raylet_client.cc:108]  Check failed: _s.ok() [RayletClient] Unable to register worker with raylet.: IOError: Ray cookie mismatch for received message. Received cookie: 68681728
*** Check failure stack trace: ***
    @   00007FFE9538174B  public: void __cdecl google::LogMessage::Flush(void) __ptr64
    @   00007FFE953804E2  public: __cdecl google::LogMessage::~LogMessage(void) __ptr64
    @   00007FFE953494F8  public: virtual __cdecl google::NullStreamFatal::~NullStreamFatal(void) __ptr64
    @   00007FFE951A231C  PyInit__raylet
    @   00007FFE9512BFE2  PyInit__raylet
    @   00007FFE95141EEC  PyInit__raylet
    @   00007FFE9512F36F  PyInit__raylet
    @   00007FFE95154937  PyInit__raylet
    @   00007FFE950C84C1  public: virtual void __cdecl google::LogSink::WaitTillSent(void) __ptr64
    @   00007FFE950EAAB7  public: virtual void __cdecl google::LogSink::WaitTillSent(void) __ptr64
    @   00007FFE950541C2  (unknown)
    @   00007FFEBEFB2892  _Py_CheckFunctionResult
    @   00007FFEBEFB4C8B  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB2D16  _Py_CheckFunctionResult
    @   00007FFEBEFB50C2  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB0F1F  _PyFunction_Vectorcall
    @   00007FFEBEF89E36  PyVectorcall_Call
    @   00007FFEBEF89CC7  PySequence_GetItem
    @   00007FFEBEFB5B05  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB0F1F  _PyFunction_Vectorcall
    @   00007FFEBEF89E36  PyVectorcall_Call
    @   00007FFEBEF89CC7  PySequence_GetItem
    @   00007FFEBEFB5B05  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB2D16  _Py_CheckFunctionResult
    @   00007FFEBEFB50C2  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEF7A9BF  PyEval_EvalCodeEx
    @   00007FFEBEF7A91D  PyEval_EvalCode
F0320 22:30:46.976624 36936 21108 raylet_client.cc:54] Could not connect to socket tcp://127.0.0.1:64535
*** Check failure stack trace: ***
    @   00007FFE9538174B  public: void __cdecl google::LogMessage::Flush(void) __ptr64
    @   00007FFE953804E2  public: __cdecl google::LogMessage::~LogMessage(void) __ptr64
    @   00007FFE953494F8  public: virtual __cdecl google::NullStreamFatal::~NullStreamFatal(void) __ptr64
    @   00007FFE951A2ACB  PyInit__raylet
    @   00007FFE951A160C  PyInit__raylet
    @   00007FFE9512BFE2  PyInit__raylet
    @   00007FFE95141EEC  PyInit__raylet
    @   00007FFE9512F36F  PyInit__raylet
    @   00007FFE95154937  PyInit__raylet
    @   00007FFE950C84C1  public: virtual void __cdecl google::LogSink::WaitTillSent(void) __ptr64
    @   00007FFE950EAAB7  public: virtual void __cdecl google::LogSink::WaitTillSent(void) __ptr64
    @   00007FFE950541C2  (unknown)
    @   00007FFEBEFB2892  _Py_CheckFunctionResult
    @   00007FFEBEFB4C8B  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB2D16  _Py_CheckFunctionResult
    @   00007FFEBEFB50C2  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB0F1F  _PyFunction_Vectorcall
    @   00007FFEBEF89E36  PyVectorcall_Call
    @   00007FFEBEF89CC7  PySequence_GetItem
    @   00007FFEBEFB5B05  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB0F1F  _PyFunction_Vectorcall
    @   00007FFEBEF89E36  PyVectorcall_Call
    @   00007FFEBEF89CC7  PySequence_GetItem
    @   00007FFEBEFB5B05  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB2D16  _Py_CheckFunctionResult
    @   00007FFEBEFB50C2  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEF7A9BF  PyEval_EvalCodeEx
F0320 22:30:47.143532 34916 15208 raylet_client.cc:54] Could not connect to socket tcp://127.0.0.1:63679
*** Check failure stack trace: ***
    @   00007FFE9538174B  public: void __cdecl google::LogMessage::Flush(void) __ptr64
    @   00007FFE953804E2  public: __cdecl google::LogMessage::~LogMessage(void) __ptr64
    @   00007FFE953494F8  public: virtual __cdecl google::NullStreamFatal::~NullStreamFatal(void) __ptr64
    @   00007FFE951A2ACB  PyInit__raylet
    @   00007FFE951A160C  PyInit__raylet
    @   00007FFE9512BFE2  PyInit__raylet
    @   00007FFE95141EEC  PyInit__raylet
    @   00007FFE9512F36F  PyInit__raylet
    @   00007FFE95154937  PyInit__raylet
    @   00007FFE950C84C1  public: virtual void __cdecl google::LogSink::WaitTillSent(void) __ptr64
    @   00007FFE950EAAB7  public: virtual void __cdecl google::LogSink::WaitTillSent(void) __ptr64
    @   00007FFE950541C2  (unknown)
    @   00007FFEBEFB2892  _Py_CheckFunctionResult
    @   00007FFEBEFB4C8B  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB2D16  _Py_CheckFunctionResult
    @   00007FFEBEFB50C2  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB0F1F  _PyFunction_Vectorcall
    @   00007FFEBEF89E36  PyVectorcall_Call
    @   00007FFEBEF89CC7  PySequence_GetItem
    @   00007FFEBEFB5B05  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB0F1F  _PyFunction_Vectorcall
    @   00007FFEBEF89E36  PyVectorcall_Call
    @   00007FFEBEF89CC7  PySequence_GetItem
    @   00007FFEBEFB5B05  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB2D16  _Py_CheckFunctionResult
    @   00007FFEBEFB50C2  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEF7A9BF  PyEval_EvalCodeEx
QuanyiLi commented 3 months ago

Sorry, I have no idea. It is something raised by Ray. How many workers are you using? Does this still persist if you only use one worker?
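
One thing you could try (just a guess on my side, and where the config ends up in train_waymo.py is an assumption): run Ray in local mode with a single rollout worker and see whether the raylet registration error still appears.

```python
import ray

# Debugging sketch: local_mode runs everything in one process, which avoids
# spawning extra worker processes that have to register with the raylet.
ray.init(local_mode=True, include_dashboard=False)

# In the RLlib/tune config that the training script builds, keep the worker
# count minimal (these are standard RLlib keys; exactly where to set them
# depends on your train_waymo.py, so treat this dict as hypothetical).
debug_overrides = dict(
    num_workers=1,             # a single rollout worker
    evaluation_num_workers=1,  # keep evaluation lightweight as well
    num_gpus=0,                # matches the --num-gpus 0 command above
)
```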

huyuening commented 3 months ago
ERROR syncer.py:63 -- Log sync requires rsync to be installed.

Could the reason be that rsync is not available on Windows?

QuanyiLi commented 3 months ago

Not sure. You can search related stuff in Ray's GitHub issue list.