PacktPublishing / Hands-On-Intelligent-Agents-with-OpenAI-Gym

Code for the Hands-On Intelligent Agents with OpenAI Gym book: get started and learn to build deep reinforcement learning agents using PyTorch
https://www.packtpub.com/big-data-and-business-intelligence/hands-intelligent-agents-openai-gym
MIT License

Car keeps turning right when using non-discrete actions #37

Closed Panshark closed 2 years ago

Panshark commented 2 years ago

Hi @praveen-palanisamy ,

I want to test how the model performs in a continuous action space, so I changed the config to this:

ENV_CONFIG = {
    "discrete_actions": False,
    "use_image_only_observations": True,  # Exclude high-level planner inputs & goal info from the observations
    "server_map": "/Game/Maps/" + city,
    "scenarios": [scenario_config["Lane_Keep_Town2"]],
    "framestack": 2,  # note: only [1, 2] currently supported
    "enable_planner": True,
    "use_depth_camera": False,
    "early_terminate_on_collision": True,
    "verbose": False,
    "render" : True,  # Render to display if true
    "render_x_res": 800,
    "render_y_res": 600,
    "x_res": 80,
    "y_res": 80,
    "seed": 1
}
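As a quick sanity check that the change takes effect, this is how I inspect the resulting action space (a minimal sketch; it assumes carla_gym registers 'Carla-v0' and that discrete_actions=False yields a continuous Box action space, e.g. [steer, throttle]):

import gym
import carla_gym  # registers the Carla-v0 environment

env = gym.make('Carla-v0')
print(env.action_space)  # expect a Box space when discrete_actions is False
env.reset()
obs, reward, done, info = env.step(env.action_space.sample())  # random continuous action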

However, the car just keeps turning right, even after training it for around 10M steps. Do you have any idea how to solve this problem?

Thanks a lot!

praveen-palanisamy commented 2 years ago

Great to see that you are training agents with a continuous action space on this driving problem! While 10M steps does sound reasonably large, for relatively high-dimensional continuous-control-from-pixels problems like this one, it is typical to train for many more steps even with state-of-the-art RL algorithms. How do your training graphs (reward plots from TensorBoard) look? Do you see the car agent completing the scenario at least once in a while? I am assuming you are using A3C for this? Also, it would help to see what you are observing (a GIF/video of the agent's performance) along with the TensorBoard plots to debug this further.
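If you are not already logging episode returns, here is a minimal sketch using TF 2.x summaries (episode_reward and episode_num are placeholders for your own training-loop variables):

import tensorflow as tf

reward_writer = tf.summary.create_file_writer("./logs/rewards")
episode_reward, episode_num = -1.0, 0  # placeholders; use your training-loop values

# Call once per finished episode inside the training loop:
with reward_writer.as_default():
    tf.summary.scalar("episode_reward", episode_reward, step=episode_num)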

Panshark commented 2 years ago

The reward is still nearly -1 after 10M steps and increases really slowly. I never saw the agent complete the scenario.

BTW, do you know how I could output a GIF or video from CARLA to TensorBoard? Sometimes I work from home, so it is hard to see the graphical interface; I've been trying to solve this problem.

praveen-palanisamy commented 2 years ago

Hi @Panshark, unfortunately A3C does take longer to train in this case (image observations with continuous-valued actions).

You can log images to Tensorboard using:

import tensorflow as tf
file_writer = tf.summary.create_file_writer("./logs")  # TF 2.x summary writer
with file_writer.as_default():
  tf.summary.image("Agent observation", img, step=0)  # img: [batch, height, width, channels]

There's a detailed tutorial/documentation here: https://www.tensorflow.org/tensorboard/image_summaries
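For a CARLA observation specifically, something like this should work (a sketch; it assumes obs is an 80x80x3 uint8 frame from the env, since tf.summary.image wants a batched 4-D tensor):

import numpy as np
import tensorflow as tf

file_writer = tf.summary.create_file_writer("./logs/observations")
obs = np.zeros((80, 80, 3), dtype=np.uint8)  # stand-in for a frame returned by env.step()

with file_writer.as_default():
    # Add a leading batch dimension: [1, height, width, channels]
    tf.summary.image("Agent observation", obs[np.newaxis, ...], step=0)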

For remote watching/debugging, you could also periodically write a GIF/video of the agent's performance over an episode to disk and download it to view on your local machine. For video recording, you can use OpenAI Gym's Monitor wrapper in your code like this:

import gym
import carla_gym  # registers the Carla-v0 environment
from gym.wrappers import Monitor

# Episode videos are written to ./video; force=True clears any old recordings there
env = Monitor(gym.make('Carla-v0'), './video', force=True)
...
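Alternatively, to produce a GIF you can download and view anywhere, a minimal sketch using imageio (it assumes the env supports rgb_array rendering, which you may need to adapt for CARLA):

import imageio

frames = []
obs = env.reset()
done = False
while not done:
    obs, reward, done, info = env.step(env.action_space.sample())  # substitute your policy's action
    frames.append(env.render(mode="rgb_array"))  # HxWx3 uint8 frame; adapt if unsupported
imageio.mimsave("episode.gif", frames, fps=15)  # fps handling varies across imageio versions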

Hope that helps.

Panshark commented 2 years ago

Hi @praveen-palanisamy ,

Thanks a lot for sharing how to save recordings; that is really helpful.

I am trying to use CARLA 0.9 these days, and I ran into another pickle problem along the way. Could you take a look at the issue I opened in macad-gym, issue #58?

praveen-palanisamy commented 2 years ago

@Panshark, thank you for reporting back. Closing this issue.