DLR-RM / stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
https://stable-baselines3.readthedocs.io
MIT License

Training of PPO freezes after number of iterations #1886

Closed: Ahmed-Radwan094 closed this issue 3 months ago

Ahmed-Radwan094 commented 3 months ago

🐛 Bug

I built a custom Carla environment and implemented a script to train it with PPO. The training runs without any errors; however, after a number of iterations, typically 50-60k, the code freezes. I verified that the code is stuck and no longer calls the environment's step function.

Code example

import traceback

from stable_baselines3 import PPO
from stable_baselines3.common.callbacks import CheckpointCallback
from stable_baselines3.common.logger import configure

# `config`, `n_envs`, `rl_algorithm`, `Carla`, and `CarlaDrivingPolicy`
# are defined elsewhere in the training script.

carla_env = None  # so the except block can safely check it
try:
    carla_env = Carla(config)

    # create a custom feature extractor in stable baselines
    policy_kwargs = dict(
        features_extractor_class=CarlaDrivingPolicy,
        features_extractor_kwargs=dict(config=config),
    )

    model = PPO("MultiInputPolicy", carla_env, policy_kwargs=policy_kwargs,
                **config['RL']['PPO_algo_params'], verbose=1)

    # set up the model logger
    logger_path = config['RL']['logger_path']
    logger_object = configure(logger_path, ["stdout", "csv", "tensorboard"])
    model.set_logger(logger_object)

    # define the model path
    model_path = config['RL']['model_path']
    # set up a checkpoint callback to save the model at a certain frequency
    checkpoint_callback = CheckpointCallback(
        save_freq=config['RL']['checkpoint_save_freq'] // n_envs,
        save_path=model_path,
        name_prefix=rl_algorithm + "_carla"
    )

    # train the agent
    model.learn(**config['RL']['train_params'], log_interval=10, callback=checkpoint_callback)
    # close the Carla environment
    print("Training complete")
    carla_env.close()
# on any exception, close the environment and print the traceback
except Exception:
    if carla_env:
        carla_env.close()
    traceback.print_exc()

Relevant log output / Error message

No response

System Info

Checklist

qgallouedec commented 3 months ago

Hey, have you checked your env with the env checker? Can you share the logs? What do you mean by freeze?

Ahmed-Radwan094 commented 3 months ago

Hey, thank you for the quick reply. I checked the env with the env checker and only received one warning, about the action type casting:

/home/ahmed/miniconda3/envs/baselines_env/lib/python3.8/site-packages/gymnasium/spaces/box.py:130: UserWarning: WARN: Box bound precision lowered by casting to float32
  gym.logger.warn(f"Box bound precision lowered by casting to {self.dtype}")
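For context, a minimal self-contained sketch of what running the env checker looks like; DummyDrivingEnv below is a stand-in for the custom Carla environment, not code from this issue:

import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3.common.env_checker import check_env


class DummyDrivingEnv(gym.Env):
    """Placeholder environment standing in for the custom Carla env."""

    def __init__(self):
        super().__init__()
        self.observation_space = spaces.Box(low=-1.0, high=1.0, shape=(4,), dtype=np.float32)
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        return self.observation_space.sample(), {}

    def step(self, action):
        obs = self.observation_space.sample()
        # obs, reward, terminated, truncated, info
        return obs, 0.0, False, False, {}


# check_env validates the observation/action spaces and the reset/step API;
# with warn=True it prints warnings like the Box precision notice quoted above.
check_env(DummyDrivingEnv(), warn=True)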

Ahmed-Radwan094 commented 3 months ago

By freeze, I mean the learn function has been stuck in a single step for more than an hour. There are no errors and the process is not killed, and I verified that Carla is alive and can be pinged.

qgallouedec commented 3 months ago

Have you tried to use your debugger and pause the process to see which line is involved?

Ahmed-Radwan094 commented 3 months ago

No, I didn't. The problem is that this happens after a large number of iterations, around 50-60k (random between runs), and I am not sure if it would be feasible to debug. I have verified that the step function is called, then suddenly it stops being called and no new commands are received. Is there a way to log information about the current function being called in model.learn(...)?
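One way to get that kind of visibility is a lightweight custom callback; the sketch below (HeartbeatCallback is an illustrative name, not an SB3 class) prints the last timestep reached at a fixed wall-clock interval, so a silent hang can be localised to the final heartbeat:

import time

from stable_baselines3.common.callbacks import BaseCallback


class HeartbeatCallback(BaseCallback):
    """Print the current timestep at a fixed wall-clock interval."""

    def __init__(self, every_seconds: float = 60.0, verbose: int = 0):
        super().__init__(verbose)
        self.every_seconds = every_seconds
        self._last_print = time.monotonic()

    def _on_step(self) -> bool:
        now = time.monotonic()
        if now - self._last_print >= self.every_seconds:
            print(f"[heartbeat] num_timesteps={self.num_timesteps}", flush=True)
            self._last_print = now
        return True  # returning False would abort training

It can be passed alongside the checkpoint callback, e.g. model.learn(..., callback=[checkpoint_callback, HeartbeatCallback(30)]); if the heartbeats stop while the process is still alive, the hang is somewhere between two environment steps.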

qgallouedec commented 3 months ago

I don't know which debugger you use, but you can usually pause manually whenever you want. As this is a custom environment, your best bet is to reduce your code as much as possible to converge on an MRE (minimal reproducible example).
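When attaching an interactive debugger is impractical, a rough alternative (a standard-library sketch, not something suggested in the thread) is to have Python periodically dump the stack of every thread, which shows which line learn() or the environment is blocked on:

import faulthandler
import sys

# Dump a traceback of all threads to stderr every 10 minutes until cancelled.
# If training freezes, the last dumps show exactly where execution is stuck.
faulthandler.dump_traceback_later(600, repeat=True, file=sys.stderr)

# ... model.learn(...) runs here ...

faulthandler.cancel_dump_traceback_later()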

Ahmed-Radwan094 commented 3 months ago

I will try that and update the ticket. Thank you for the support.

Ahmed-Radwan094 commented 3 months ago

There was an exception in the environment itself, and it is now fixed.