GazzolaLab / Elastica-RL-control

Code for the cases presented in the paper "Elastica: A compliant mechanics environment for soft robotic control"
MIT License

Elastica + RL benchmark cases

This repo provides supplementary code and benchmark data for the paper "Elastica: A compliant mechanics environment for soft robotic control", published in IEEE Robotics and Automation Letters.

Elastica is an environment for simulating assemblies of one-dimensional soft, slender structures using Cosserat rod theory. More information about Elastica is available on the project website. You can install the Python version of Elastica via pip install pyelastica.

In this repo, Elastica is interfaced with Stable Baselines to investigate how RL can dynamically control a compliant robotic arm. You can install Stable Baselines via pip install stable-baselines[mpi] (note: Stable Baselines only works with TensorFlow <= v1.15). Five model-free RL algorithms from the Stable Baselines implementations are used. Two are on-policy algorithms, Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO), and three are off-policy algorithms: Soft Actor Critic (SAC), Deep Deterministic Policy Gradient (DDPG), and Twin Delayed DDPG (TD3). Four different cases are considered, with detailed explanations given in the paper.
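As a rough illustration of the grouping described above, the five algorithms can be kept in a small lookup table. This is a minimal sketch, not code from this repo: the names ALGORITHMS, algorithms, and load_algorithm_class are hypothetical, and since Stable Baselines ships both PPO1 and PPO2, the choice of PPO1 below is an assumption. The import is done lazily so the table can be inspected without TensorFlow installed.

```python
import importlib

# On-policy / off-policy grouping as stated in the text above.
# "class_name" is the class exported by the stable_baselines package.
# Choosing PPO1 (rather than PPO2) is an assumption made for illustration.
ALGORITHMS = {
    "TRPO": {"on_policy": True, "class_name": "TRPO"},
    "PPO": {"on_policy": True, "class_name": "PPO1"},
    "SAC": {"on_policy": False, "class_name": "SAC"},
    "DDPG": {"on_policy": False, "class_name": "DDPG"},
    "TD3": {"on_policy": False, "class_name": "TD3"},
}


def algorithms(on_policy):
    """Return the sorted algorithm names in the requested group."""
    return sorted(
        name for name, info in ALGORITHMS.items() if info["on_policy"] == on_policy
    )


def load_algorithm_class(name):
    """Import the Stable Baselines class for `name` only when it is needed.

    Requires stable-baselines (and a compatible TensorFlow) to be installed.
    """
    module = importlib.import_module("stable_baselines")
    return getattr(module, ALGORITHMS[name]["class_name"])
```

Calling algorithms(True) lists the on-policy methods and algorithms(False) the off-policy ones; load_algorithm_class("SAC") would return the SAC class if Stable Baselines is installed.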

If you discover any bugs, please open an issue and let us know. We plan to actively maintain and develop these benchmark cases.

The visualization scripts we use for these cases are provided in the visualization folder. Data from hyperparameter tuning is available in the supplementary_data folder.

Case 1: 3D tracking of a randomly moving target

In this case, the arm continuously tracks a randomly moving target in 3D space. Actuation is allowed only in the normal and binormal directions, with 6 control points in each direction.

Case 2: Reaching a randomly located target with a defined orientation

In this case, the arm reaches a randomly positioned stationary target while reorienting itself to match the target's orientation. Actuation is allowed in the normal, binormal, and tangent directions, with 6 control points in each direction.

Case 3: Underactuated maneuvering between structured obstacles

In this case, the arm reaches a stationary target placed behind an array of eight obstacles with an opening through which the arm must maneuver. The target is placed in the normal plane so that only in-plane actuation is required; thus, actuation is allowed only in the normal direction. Case 3 has two subcases. The first trains with 2 manually placed control points at 40% and 90% of the arm, and the second trains with 2, 4, 6, and 8 equidistant control points. Code for the manually selected control points is located in the Case3/Case3_main-text/ folder, and code for the equidistant control points is located in Case3/Case3_SI-ctrl_pts/.
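To make the two control-point layouts concrete, the sketch below places control points along the arm as fractions of its length and evaluates a torque profile between them. It is illustrative only and rests on several assumptions that are not taken from this repo: "equidistant" is read as n interior points splitting the arm into n + 1 equal segments, the torque is taken to be zero at both ends of the arm, and linear interpolation is used purely for brevity (the repository's own interpolation scheme may differ). All names are hypothetical.

```python
from bisect import bisect_right

# Manually placed control points at 40% and 90% of the arm (from the text).
MANUAL_POSITIONS = (0.4, 0.9)


def equidistant_positions(n):
    """Assumed layout: n interior points giving n + 1 equal segments."""
    return [i / (n + 1) for i in range(1, n + 1)]


def torque_profile(positions, values, s):
    """Piecewise-linear torque at arc-length fraction s in [0, 1].

    Assumes zero torque at the base (s = 0) and at the tip (s = 1).
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie in [0, 1]")
    xs = [0.0, *positions, 1.0]
    ys = [0.0, *values, 0.0]
    # Index of the right-hand knot of the segment containing s.
    i = min(bisect_right(xs, s), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (s - x0) / (x1 - x0)
```

For example, torque_profile(MANUAL_POSITIONS, (1.0, -1.0), 0.2) evaluates the profile halfway between the base and the first control point, and equidistant_positions(4) gives the assumed layout for the four-point subcase.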

Case 4: Underactuated maneuvering between unstructured obstacles

In this case, the arm reaches a stationary target by maneuvering around an unstructured nest of twelve randomly located obstacles. Actuation for this case is similar to Case 3, using two manually placed control points at 40% and 90% of the arm. Unlike Case 3, actuation is allowed in both the normal and binormal directions.
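The actuation layouts of the four cases can be summarized in a small table. The sketch below is a hypothetical summary, not code from this repo; it assumes one scalar action per control point per actuated direction, so the action dimension is the number of directions times the number of control points. Case 3 is listed with its main-text layout of 2 control points.

```python
# Actuated directions and control points per direction, as described in the
# case descriptions above. Names here are illustrative only.
CASES = {
    "Case 1": {"directions": ("normal", "binormal"), "control_points": 6},
    "Case 2": {"directions": ("normal", "binormal", "tangent"), "control_points": 6},
    "Case 3": {"directions": ("normal",), "control_points": 2},
    "Case 4": {"directions": ("normal", "binormal"), "control_points": 2},
}


def action_dimension(case):
    """Assumed action size: one scalar per control point per direction."""
    layout = CASES[case]
    return len(layout["directions"]) * layout["control_points"]


if __name__ == "__main__":
    for name in CASES:
        print(name, action_dimension(name))
```

Under this assumption the action dimensions are 12, 18, 2, and 4 for Cases 1 through 4; the equidistant subcases of Case 3 would give 2, 4, 6, and 8.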

Citation

We ask that any publications which use these benchmark cases cite the original paper:

Naughton, Sun, Tekinalp, Parthasarathy, Chowdhary and Gazzola, Elastica: A compliant mechanics environment for soft robotic control, IEEE Robotics and Automation Letters, 2021. doi: 10.1109/LRA.2021.3063698

@article{Naughton2021,
  author={Naughton, Noel and Sun, Jiarui and Tekinalp, Arman and Parthasarathy, Tejaswin and Chowdhary, Girish and Gazzola, Mattia},
  journal={IEEE Robotics and Automation Letters}, 
  title={Elastica: A compliant mechanics environment for soft robotic control}, 
  year={2021},
  volume={6},
  number={2},
  pages={3389-3396},
  doi={10.1109/LRA.2021.3063698}
}