jr-robotics / robo-gym

An open source toolkit for Distributed Deep Reinforcement Learning on real and simulated robots.
https://sites.google.com/view/robo-gym
MIT License
428 stars, 75 forks

Stable-Baselines ERROR #79

Open lgortiz1 opened 10 months ago

lgortiz1 commented 10 months ago

Hello community, I am trying to train the UR3 on the end-effector positioning task using Stable-Baselines.

My setup is Ubuntu 20.04, ROS Noetic, Python 3.8, Stable-Baselines3 = 2.0.0.0, and TensorFlow 2.

The Stable-Baselines example in the repository uses the previous version of the library, but with Python 3.8 it is not possible to install TensorFlow 1.15, which that version requires. So I chose to use Stable-Baselines3 instead. When I run the code for the MiR100 robot, it works fine.

The problem appears when I change the environment to the UR3 one, "EndEffectorPositioningURSim-v0". The simulation runs fine in Gazebo, but the line "model.learn(total_timesteps=15000)" raises the error shown below.

Can anyone help me fix this error, or recommend a combination of versions with which Stable-Baselines works well? I have thought about moving everything to Ubuntu 18.04, but I would rather keep developing on a newer setup than go backwards.


/bin/python3 /home/luigi/robogym_ws/src/robo-gym/docs/examples/stable-baselines/td3_script.py
2024-01-25 19:29:25.298682: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-01-25 19:29:25.376278: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-01-25 19:29:26.503590: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Starting new Robot Server | Tentative 1 of 10
Successfully started Robot Server at 127.0.0.1:49103
/home/luigi/.local/lib/python3.8/site-packages/gym/spaces/box.py:127: UserWarning: WARN: Box bound precision lowered by casting to float32
  logger.warn(f"Box bound precision lowered by casting to {self.dtype}")
Using cuda device
Wrapping the env with a `Monitor` wrapper
Wrapping the env in a DummyVecEnv.
/home/luigi/.local/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:174: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed a `seed` instead of using `Env.seed` for resetting the environment random number generator.
  logger.warn(
/home/luigi/.local/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:187: UserWarning: WARN: Future gym versions will require that `Env.reset` can be passed `options` to allow the environment initialisation to be passed additional information.
  logger.warn(
/home/luigi/.local/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:195: UserWarning: WARN: The result returned by `env.reset()` was not a tuple of the form `(obs, info)`, where `obs` is a observation and `info` is a dictionary containing additional information. Actual type: `<class 'numpy.ndarray'>`
  logger.warn(
/home/luigi/.local/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:219: DeprecationWarning: WARN: Core environment is written in old step API which returns one bool instead of two. It is recommended to rewrite the environment with new step API. 
  logger.deprecation(
/home/luigi/.local/lib/python3.8/site-packages/gym/utils/passive_env_checker.py:225: DeprecationWarning: `np.bool8` is a deprecated alias for `np.bool_`.  (Deprecated NumPy 1.24)
  if not isinstance(done, (bool, np.bool8)):
Traceback (most recent call last):
  File "/home/luigi/robogym_ws/src/robo-gym/docs/examples/stable-baselines/td3_script.py", line 19, in <module>
    model.learn(total_timesteps=15000)
  File "/home/luigi/.local/lib/python3.8/site-packages/stable_baselines3/td3/td3.py", line 195, in learn
    return super(TD3, self).learn(
  File "/home/luigi/.local/lib/python3.8/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 266, in learn
    rollout = self.collect_rollouts(
  File "/home/luigi/.local/lib/python3.8/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 472, in collect_rollouts
    new_obs, reward, done, infos = env.step(action)
  File "/home/luigi/.local/lib/python3.8/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 163, in step
    return self.step_wait()
  File "/home/luigi/.local/lib/python3.8/site-packages/stable_baselines3/common/vec_env/dummy_vec_env.py", line 51, in step_wait
    return (self._obs_from_buf(), np.copy(self.buf_rews), np.copy(self.buf_dones), deepcopy(self.buf_infos))
  File "/usr/lib/python3.8/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.8/copy.py", line 205, in _deepcopy_list
    append(deepcopy(a, memo))
  File "/usr/lib/python3.8/copy.py", line 146, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.8/copy.py", line 230, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.8/copy.py", line 161, in deepcopy
    rv = reductor(4)
TypeError: cannot pickle 'google.protobuf.pyext._message.ScalarMapContainer' object

[Screenshot from 2024-01-25 19-43-48] [Screenshot from 2024-01-25 19-46-19]

f4rh4ng commented 8 months ago

Hi, check out the following issue. It will most likely help you solve your problem. 🙂

https://github.com/jr-robotics/robo-gym/issues/58
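
The traceback above ends in `deepcopy(self.buf_infos)`, so the failure appears to come from the `info` dictionary returned by the environment's `step` holding a protobuf `ScalarMapContainer`, which `copy.deepcopy` cannot handle. A minimal sketch of one possible workaround, not necessarily the fix described in the linked issue, is to convert such values to plain Python containers before they reach Stable-Baselines3. The helper `to_plain`, the stand-in class `UncopyableMap`, and the `info` keys below are hypothetical, and the sketch assumes the protobuf container exposes a dict-like `items()` method.

```python
import copy


def to_plain(value):
    """Recursively turn dict-like and list-like values into plain dicts and lists."""
    if hasattr(value, "items"):  # assumption: the protobuf map is dict-like
        return {k: to_plain(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(v) for v in value]
    return value


class UncopyableMap:
    """Hypothetical stand-in for a map container that deepcopy rejects."""

    def __init__(self, data):
        self._data = dict(data)

    def items(self):
        return self._data.items()

    def __deepcopy__(self, memo):
        raise TypeError("cannot pickle 'UncopyableMap' object")


# Hypothetical info dict, shaped like what an environment's step() might return.
info = {"final_status": "success", "target_coord": UncopyableMap({"x": 0.1, "y": 0.2})}

# Copying the raw info list fails, mirroring the deepcopy call in the traceback.
try:
    copy.deepcopy([info])
    raw_copy_ok = True
except TypeError:
    raw_copy_ok = False

# After conversion the same structure can be deep-copied.
plain_info = to_plain(info)
copied = copy.deepcopy([plain_info])
```

In practice, `to_plain(info)` could be applied inside the `step` method of a `gym.Wrapper` subclass placed around the robo-gym environment, so that the vectorized environment only ever sees plain dictionaries.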