TypeError: catching classes that do not inherit from BaseException is not allowed

zichunxx commented 7 months ago

Describe the bug

Hi!

I was following the Stable-Baselines3 tutorial to write a training script for the PandaPickAndPlace-v3 env. When I ran the code snippet under "To Reproduce" below, I got this error:

pybullet build time: May 20 2022 19:45:31
argv[0]=--background_color_red=0.8745098114013672
argv[1]=--background_color_green=0.21176470816135406
argv[2]=--background_color_blue=0.1764705926179886
mean_reward: -48.50 +/- 8.53
Exception ignored in: <function BulletClient.__del__ at 0x7f568159a1f0>
Traceback (most recent call last):
  File "/home/xzc/mambaforge/envs/sb-zoo/lib/python3.9/site-packages/pybullet_utils/bullet_client.py", line 43, in __del__
TypeError: catching classes that do not inherit from BaseException is not allowed

It seems that this error is triggered by pybullet. To confirm, I created a CartPole-v1 env with the PPO policy instead, and the error was gone. Is pybullet being shut down improperly in the PandaPickAndPlace-v3 env script?

Thanks!

To Reproduce

Minimal code:

import gymnasium as gym
import numpy as np
from stable_baselines3 import HER, HerReplayBuffer, SAC, DDPG
from stable_baselines3.td3.policies import MultiInputPolicy
from stable_baselines3.common.evaluation import evaluate_policy
from stable_baselines3 import PPO
import panda_gym
from stable_baselines3.ppo.policies import MlpPolicy

env = gym.make("PandaPickAndPlace-v3")
model = DDPG(MultiInputPolicy, env, verbose=0)

# env = gym.make("CartPole-v1")
# model = PPO(MlpPolicy, env, verbose=0)

# mean_reward_before_train = evaluate(model, num_episodes=100, deterministic=True)
mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=100, warn=False)
print(f"mean_reward: {mean_reward:.2f} +/- {std_reward:.2f}")

System

OS: Ubuntu 20.04
Python version (python --version): Python 3.9.18
Package version (pip list | grep panda-gym): 3.0.7

qgallouedec commented 7 months ago

First of all, note that the error is only triggered when the objects are destroyed (so your code works all the way through, and isn't stopped early).

Secondly, I'm aware of this error, and it's actually a pybullet issue. In its magic method __del__, it tries to catch an error that doesn't inherit from BaseException. In any case, even when this error is caught correctly, the handler does nothing (see pybullet source code). Hence, you can safely ignore it.
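
For illustration, here is a minimal sketch of the general Python behaviour behind this message. It does not use pybullet's actual code; NotAnException and risky_cleanup are hypothetical names made up for the example. Python only checks the class named in an except clause when an exception actually reaches it, so the TypeError appears at that moment rather than when the function is defined.

```python
class NotAnException:
    """Hypothetical stand-in: a class that does NOT inherit from BaseException."""


def risky_cleanup():
    """Hypothetical cleanup routine with an invalid except clause."""
    try:
        # Simulate a failure during cleanup.
        raise RuntimeError("simulated cleanup failure")
    except NotAnException:
        # Never reached: Python rejects the except clause itself.
        pass


try:
    risky_cleanup()
except TypeError as exc:
    message = str(exc)

print(message)
```

When the same thing happens inside a __del__ method, Python cannot propagate the exception, so it prints an "Exception ignored in: ..." notice to stderr and carries on, which matches the output reported above.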

zichunxx commented 7 months ago

Thanks for your quick response!

qgallouedec commented 7 months ago

For the record, I'm not able to reproduce the error. In my case, the code runs without error with both algorithms.

zichunxx commented 7 months ago

Could you please tell me your system and lib version information? I want to give it a try.

qgallouedec commented 7 months ago

In fact, I've managed to reproduce it, so here's an explanation:

When you don't close an environment (no env.close()), pybullet takes care of closing it at the end of the script (when all objects are destroyed) and disconnects from the simulation server. However, pybullet doesn't do this correctly (see above). To avoid this error, simply add an env.close() to close the environment properly:

import gymnasium as gym
from stable_baselines3 import DDPG
from stable_baselines3.common.evaluation import evaluate_policy
import panda_gym

env = gym.make("PandaPickAndPlace-v3")
model = DDPG("MultiInputPolicy", env)
evaluate_policy(model, env, n_eval_episodes=100, warn=False)
env.close()
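
If something raises before the final env.close() is reached, the environment is still left open. One possible way to guard against that, sketched here with a hypothetical DummyEnv stand-in rather than a real panda_gym environment, is to wrap the env in contextlib.closing from the standard library, which calls close() on exit even when an exception occurs. This only assumes the object exposes a close() method, as gymnasium environments do.

```python
from contextlib import closing


class DummyEnv:
    """Hypothetical stand-in for an environment that exposes close()."""

    def __init__(self):
        self.closed = False

    def run(self):
        # Simulate evaluation code failing part-way through.
        raise ValueError("simulated failure during evaluation")

    def close(self):
        self.closed = True


env = DummyEnv()
try:
    # closing() calls env.close() when the block exits, error or not.
    with closing(env):
        env.run()
except ValueError:
    pass

print(env.closed)
```

A plain try/finally with env.close() in the finally branch achieves the same thing.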

zichunxx commented 7 months ago

It works! Thanks again for your positive help.
