Notice: at the moment we are not actively maintaining this repository so we may not be able to reply to issues in a timely manner.
pybullet-robot-envs is a Python package that collects robotic environments based on the PyBullet simulator, suitable for developing and testing Reinforcement Learning algorithms on simulated grasping and manipulation applications.
The pybullet-robot-envs environments inherit from the OpenAI Gym interface.
The package provides environments for the iCub Humanoid robot and the Franka Emika Panda manipulator.
This repository is part of a project which aims to develop Reinforcement Learning approaches for the accomplishment of grasp and manipulation tasks with the iCub Humanoid robot and the Franka Emika Panda robot.
A Reinforcement Learning based approach generally includes two basic modules: the environment, i.e. the world, and the agent, i.e. the algorithm. The agent sends actions to the environment, which replies with observations and rewards. This repository provides environments with an OpenAI Gym interface that interact with the PyBullet module to simulate the robotic tasks and the learned policies.
Simulators are a useful resource for implementing and testing Reinforcement Learning algorithms on a robotic system before porting them to the real-world platform, avoiding any risk to the robot and the environment. PyBullet is a Python module for physics simulation for robotics, visual effects and reinforcement learning, based on the Bullet Physics SDK. See the PyBullet Quickstart Guide for specific information. Its main features are:
The pybullet-robot-envs environments adopt the OpenAI Gym environment interface, which has become a sort of standard in the RL world. RL agents can easily interact with different environments through this common interface without any additional implementation effort. An OpenAI Gym interface has the following basic methods:
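The classic Gym API is built around reset(), step(action), render() and close(). The sketch below is a toy stand-in that mimics this interface in plain Python, so it runs without Gym or PyBullet installed. The class ToyReachEnv, its 1-D state, its reward and the helper run_episode are hypothetical and are not part of this package. It assumes the classic step() return signature (observation, reward, done, info).

```python
import random


class ToyReachEnv:
    """Toy 1-D 'reach' task mimicking the classic Gym interface (hypothetical)."""

    def __init__(self, target=0.5, max_steps=50):
        self.target = target
        self.max_steps = max_steps
        self.position = 0.0
        self.steps = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.position = 0.0
        self.steps = 0
        return self.position

    def step(self, action):
        # Clip the action to [-0.1, 0.1], apply it, then return
        # (observation, reward, done, info).
        self.position += max(-0.1, min(0.1, action))
        self.steps += 1
        distance = abs(self.target - self.position)
        reward = -distance
        done = distance < 0.05 or self.steps >= self.max_steps
        return self.position, reward, done, {"distance": distance}

    def render(self):
        print(f"step={self.steps} position={self.position:.2f}")

    def close(self):
        pass


def run_episode(env, policy):
    """Run one episode and return the accumulated reward."""
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(obs)                        # the agent picks an action
        obs, reward, done, info = env.step(action)  # the environment replies
        total_reward += reward
    env.close()
    return total_reward


if __name__ == "__main__":
    random.seed(0)
    print(run_episode(ToyReachEnv(), lambda obs: random.uniform(-0.1, 0.1)))
```

Because the loop only relies on reset() and step(), the same run_episode function should work with any environment exposing that classic interface.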
pybullet-robot-envs requires Python 3 (>=3.5).
Before installing the required dependencies, you may want to create a virtual environment and activate it:
$ virtualenv ve_pybullet
$ source ve_pybullet/bin/activate
Install Git LFS by following the instructions at git-lfs.github.com.
Clone the repository:
$ git clone https://github.com/robotology-playground/pybullet-robot-envs.git
$ cd pybullet-robot-envs
Install the dependencies necessary to run and test the environments with PyBullet:
$ pip3 install -r requirements.txt
Note: Installing the requirements will also install Stable Baselines.
Install the package:
$ pip3 install -e .
After this step, the environments can be instantiated in any script:
import gym
env = gym.make('pybullet_robot_envs:iCubReach-v0')
where iCubReach-v0 is the environment id. You can check the available environment ids in the file pybullet_robot_envs/__init__.py. If you create a new environment and want to register it as a Gym environment, modify this file by adding a new register(id=<id_env>, entry_point=<path_to_import_env>) call. See this guide for detailed instructions.
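A registration entry could look like the sketch below. The id MyRobotReach-v0 and the entry point my_package.envs.my_robot_reach_env:MyRobotReachEnv are hypothetical placeholders, and max_episode_steps is an optional argument chosen here only for illustration. The import is wrapped in a function and guarded so the snippet also runs where Gym is not installed; in the package's registration file a plain top-level import and register(...) call would normally be used.

```python
# Hypothetical id and entry point ("module.path:ClassName"), for illustration only.
ENV_ID = "MyRobotReach-v0"
ENTRY_POINT = "my_package.envs.my_robot_reach_env:MyRobotReachEnv"


def register_env():
    """Register the environment with Gym; return False if Gym is not installed."""
    try:
        from gym.envs.registration import register
    except ImportError:
        return False
    # Gym resolves the entry point lazily, when gym.make(ENV_ID) is called.
    register(id=ENV_ID, entry_point=ENTRY_POINT, max_episode_steps=1000)
    return True
```

Once registered, the environment would be created with gym.make using the same id, provided the entry point module can actually be imported.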
You can test your installation by running the following basic robot simulations on PyBullet:
$ python pybullet_robot_envs/examples/helloworlds/helloworld_icub.py
$ python pybullet_robot_envs/examples/helloworlds/helloworld_panda.py
The README.md file provides detailed information about the robotic environments of the repository. In general, there are three types of environments:
Run the following script to open an interactive GUI in PyBullet and test the iCub environment:
Forward kinematics: use the sliders to change the positions of the iCub arm's joints
$ python pybullet_robot_envs/examples/test_envs/test_icub_push_gym_env.py --arm l --joint
Inverse kinematics: use the sliders to change the (x, y, z) position and (roll, pitch, yaw) orientation of the iCub's hand
$ python pybullet_robot_envs/examples/test_envs/test_icub_push_gym_env.py --arm l
Random policy: test a random policy, in both Cartesian and joint control spaces, by adding --random_policy
$ python pybullet_robot_envs/examples/test_envs/test_icub_push_gym_env.py --arm l --random_policy
Run the following script to open an interactive GUI in PyBullet and test the Panda environment:
Forward kinematics: use the sliders to change the positions of the Panda arm's joints
$ python pybullet_robot_envs/examples/test_envs/test_panda_push_gym_env.py
Inverse kinematics: use the sliders to change the (x, y, z) position and (roll, pitch, yaw) orientation of the Panda's end effector
$ python pybullet_robot_envs/examples/test_envs/test_panda_push_gym_env.py --cart
Random policy: test a random policy, in both Cartesian and joint control spaces, by adding --random_policy
$ python pybullet_robot_envs/examples/test_envs/test_panda_push_gym_env.py --random_policy
Run the following scripts to train and test the implemented environments with the standard DDPG algorithm from Stable Baselines.
You can find more examples in the repository https://github.com/eleramp/robot-agents, a Python-based framework composed of two main cores:
Train iCub to perform a push task using the Stable Baselines implementation of DDPG (continuous action space):
$ python pybullet_robot_envs/examples/algos/train/baselines/icub_envs/train_ddpg_pushing.py
The trained model is saved as a .pkl file in the ../pybullet_logs/icubpush_ddpg folder, together with log files such as TensorBoard event files.
Track the learning process with TensorBoard:
$ tensorboard --logdir ../pybullet_logs/icubpush_ddpg
TensorBoard 1.13.1 at <url>:6006 (Press CTRL+C to quit)
Open <url>:6006 in a web browser and track the mean reward per episode.
Test the trained model on episodes of 1000 timesteps:
$ python pybullet_robot_envs/examples/algos/test/baselines/icub_envs/test_ddpg_pushing.py
Train Panda to perform a reach task using the Stable Baselines implementation of DDPG (continuous action space):
$ python pybullet_robot_envs/examples/algos/train/baselines/panda_envs/train_ddpg_reaching.py
The trained model is saved as a .pkl file in the ../pybullet_logs/pandareach_ddpg folder, together with log files such as TensorBoard event files.
Track the learning process with TensorBoard:
$ tensorboard --logdir ../pybullet_logs/pandareach_ddpg
TensorBoard 1.13.1 at <url>:6006 (Press CTRL+C to quit)
Open <url>:6006 in a web browser and track the mean reward per episode.
Test the trained model on episodes of 1000 timesteps:
$ python pybullet_robot_envs/examples/algos/test/baselines/panda_envs/test_ddpg_reaching.py
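Testing a trained model over step-capped episodes and reporting the mean reward per episode can be sketched as follows. This is not the repository's test script: the function evaluate and its parameters are hypothetical, and it only assumes an environment with the classic Gym reset()/step() signature and a policy given as a callable mapping an observation to an action.

```python
def evaluate(env, policy, n_episodes=5, max_steps=1000):
    """Return the mean total reward over n_episodes, each capped at max_steps."""
    episode_rewards = []
    for _episode in range(n_episodes):
        obs = env.reset()
        total = 0.0
        for _step in range(max_steps):
            # Classic Gym signature: (observation, reward, done, info).
            obs, reward, done, info = env.step(policy(obs))
            total += reward
            if done:
                break
        episode_rewards.append(total)
    return sum(episode_rewards) / len(episode_rewards)
```

With a Stable Baselines model, the policy could be something like lambda obs: model.predict(obs)[0], since predict returns an (action, state) pair.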