
IERG5350 Reinforcement Learning Course Project based on the Stanford AI lab's work on multimodal representation.

Reinforcement Learning of Peg Insertion Robot Arm Agent with Multimodal Sensor Fusion

A preliminary version of the Python implementation. The code is not well organized at the moment.

We will release a nicer version later. ( _(:з」∠)_ painful final exams...)

The idea of this project is inspired by the papers on multimodal representation learning by Michelle Lee, Yuke Zhu, et al.

Some of the code is taken from their implementation: https://github.com/stanford-iprl-lab/multimodal_representation

The PPO trainer is borrowed from Assignment 5 of IERG5350 - Reinforcement Learning: https://github.com/cuhkrlcourse/ierg5350-assignment

The borrowed code has been modified to fit this application.
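
For readers unfamiliar with PPO, the heart of such a trainer is the clipped surrogate objective. The sketch below is a generic, minimal illustration; the function and variable names are our own and are not taken from the assignment code:

```python
import torch

def ppo_policy_loss(log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped PPO surrogate loss (generic illustration, not the borrowed trainer).

    log_probs:     log pi(a|s) under the current policy parameters
    old_log_probs: log pi(a|s) recorded at rollout time (detached)
    advantages:    advantage estimates (e.g. from GAE), same shape as log_probs
    """
    ratio = torch.exp(log_probs - old_log_probs)                       # pi_new / pi_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()                       # negate: optimizer minimizes
```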

The simulation environment is built with PyBullet. Basically, it contains a KUKA robot arm, a cover box, and a button inside the box. There is a hole on the top of the box, so the KUKA's end-effector (the peg) can only press the button by inserting the peg through the hole. The agent receives a reward of 10 for touching the cover box and a reward of 50 for pressing the button. A detailed explanation will be released later (maybe not).
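
As a rough illustration of this reward scheme (not the actual environment code; in the real environment the contact flags would come from PyBullet contact queries, and the names below are hypothetical):

```python
def compute_reward(touched_cover_box: bool, pressed_button: bool) -> float:
    """Sparse reward described above: +10 for touching the cover box,
    +50 for pressing the button."""
    reward = 0.0
    if touched_cover_box:
        reward += 10.0
    if pressed_button:
        reward += 50.0
    return reward
```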

TODO

  1. A nicer implementation;
  2. Enable simulation parallelism (run multiple simulations at a time);
  3. Variational training for the sensor fusion encoder (see the sketch after this list).
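
For item 3, "variational training" would roughly mean treating the fused latent as a distribution and adding a KL regularizer to the encoder loss. The snippet below is a generic VAE-style sketch, not code from this repository:

```python
import torch

def reparameterize(mu, logvar):
    """Sample z ~ N(mu, sigma^2) using the reparameterization trick."""
    std = torch.exp(0.5 * logvar)
    return mu + std * torch.randn_like(std)

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, sigma^2) || N(0, I) ), averaged over the batch."""
    return (-0.5 * (1.0 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1)).mean()
```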

Requirements

pip install -r requirements.txt

Train the agent

python train_peg_insertion.py

Collect the multimodal dataset for encoder pre-training

python environments/kuka_peg_env.py

[Note] You can collect more data by re-running the script with a different random seed.
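
One simple way to vary the seed is to set it near the top of the collection script before each run; where exactly environments/kuka_peg_env.py seeds its randomness may differ, so treat this as a sketch:

```python
import random
import numpy as np

SEED = 42  # change this value between collection runs to obtain different trajectories
random.seed(SEED)
np.random.seed(SEED)
```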

Pre-train the fusion encoder

python multimodal/train_my_fusion_model.py

[Note] Specify the path to the root directory of the multimodal dataset.
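
How the path is specified depends on the script (it may be a hard-coded constant or a command-line argument); the snippet below is purely illustrative of the argparse-style approach, with a hypothetical flag name:

```python
import argparse

parser = argparse.ArgumentParser(description="Pre-train the multimodal fusion encoder")
parser.add_argument("--dataset_root", type=str, required=True,
                    help="path to the root directory of the collected multimodal dataset")
args = parser.parse_args()
print("Loading multimodal dataset from:", args.dataset_root)
```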