Closed Ahmed-Radwan094 closed 3 months ago
Hey, have you checked your env with the env checker? Can you share the logs? What do you mean by freeze?
Hey, thank you for the quick reply. I checked the env with the env checker and received only one warning, from the action type casting:
/home/ahmed/miniconda3/envs/baselines_env/lib/python3.8/site-packages/gymnasium/spaces/box.py:130: UserWarning: WARN: Box bound precision lowered by casting to float32
  gym.logger.warn(f"Box bound precision lowered by casting to {self.dtype}")
By freeze, I mean the learn function stays stuck on a single step for more than an hour. There are no errors and no process is killed, and I verified that Carla is alive and can be pinged.
Have you tried to use your debugger and pause the process to see which line is involved?
No, I didn't. The problem is that this happens after a large number of iterations, around 50-60k (it varies randomly between runs), and I am not sure it would be feasible to debug. I have verified that the step function is called, then suddenly stops being called, and no new commands are received. Is there a way to log information about the function currently being called in model.learn(...)?
I don't know which debugger you use, but you can usually manually pause whenever you want. As this is a custom environment, your best bet is to reduce your code as much as possible to converge on an MRE.
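If attaching a debugger to a run that only hangs after 50-60k steps is impractical, one alternative is Python's standard-library faulthandler module, which can write the traceback of every thread to a file on a timer or when a signal arrives. The sketch below is only an illustration: the helper names enable_hang_diagnostics and disable_hang_diagnostics and the 600-second default are assumptions, not part of Stable-Baselines3 or of the code in this issue.

```python
import faulthandler
import signal
import sys


def enable_hang_diagnostics(timeout_s=600.0, log_file=sys.stderr):
    """Hypothetical helper: call once before model.learn(...).

    Every `timeout_s` seconds, the traceback of every thread is written to
    `log_file`. If training is frozen, consecutive dumps show the same line.
    `log_file` must be a real file object (it needs fileno()) and must stay
    open for as long as the diagnostics are enabled.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=log_file)
    # On platforms that support it (not Windows), also dump on demand:
    #   kill -USR1 <pid>
    if hasattr(faulthandler, "register") and hasattr(signal, "SIGUSR1"):
        faulthandler.register(signal.SIGUSR1, file=log_file, all_threads=True)


def disable_hang_diagnostics():
    """Hypothetical helper: stop the periodic dump and the signal handler."""
    faulthandler.cancel_dump_traceback_later()
    if hasattr(faulthandler, "register") and hasattr(signal, "SIGUSR1"):
        faulthandler.unregister(signal.SIGUSR1)
```

Calling enable_hang_diagnostics() at the top of the training script and then sending SIGUSR1 to the frozen process (or waiting for the next periodic dump) should show which Python line each thread is sitting on, without having to reproduce the hang under a debugger.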
I will try that and update the ticket. Thank you for the support.
There was an exception in the environment itself and now it is fixed.
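The thread does not say how the exception was hidden or how it was found. As a rough sketch of one way to surface such errors, the wrapper below logs any exception raised inside step() together with the step count and the action, then re-raises it. The class name StepExceptionLogger and the logger name are hypothetical, and the sketch only assumes the wrapped object has a Gymnasium-style step(action) method. To pass an environment like this to Stable-Baselines3, the same idea would more likely be written as a subclass of gymnasium.Wrapper.

```python
import logging

logger = logging.getLogger("env_debug")  # hypothetical logger name


class StepExceptionLogger:
    """Hypothetical wrapper: logs exceptions raised by env.step, then re-raises."""

    def __init__(self, env):
        self.env = env
        self.num_steps = 0

    def step(self, action):
        self.num_steps += 1
        try:
            return self.env.step(action)
        except Exception:
            # logger.exception records the message plus the full traceback.
            logger.exception(
                "env.step failed at step %d with action %r",
                self.num_steps,
                action,
            )
            raise

    def __getattr__(self, name):
        # Forward everything else (reset, close, spaces, ...) to the wrapped env.
        if name == "env":
            raise AttributeError(name)
        return getattr(self.env, name)
```

With logging configured to write to a file, the last record before a stall would point at the step and action that triggered the failure.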
🐛 Bug
I built a custom Carla environment and implemented a script to train an agent on it with PPO. Training runs without any errors; however, after a number of iterations, typically 50-60k, the code freezes. I validated that the code is stuck and no longer calls the step function of the environment.
Code example
Relevant log output / Error message
No response
System Info
Checklist