https://github.com/shamilmamedov/flexible_arm/assets/59015432/d1e1a90e-e378-4fa5-a520-d8d2ca81f6c2
https://github.com/shamilmamedov/flexible_arm/assets/59015432/b3b2f49b-bc46-4bc9-8e92-94a316daca1a
Install acados by following the instructions for its Python interface: https://docs.acados.org/python_interface/index.html

As of 21 August 2023, the current version of the imitation library does not yet support Gymnasium, so we use our own fork of it with the necessary modifications.
After cloning this repo, fetch the fork as a submodule and install it in editable mode:

    git submodule init
    git submodule update
    cd imitation
    pip install -e .
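
Once both installations finish, a quick import check can confirm that the packages are visible to the current interpreter. The sketch below is not part of the repository. It assumes the acados Python interface is importable as `acados_template` and the fork as `imitation`; the helper name `check_packages` is made up for illustration.

```python
import importlib.util


def check_packages(names):
    """Return {name: bool} telling whether each top-level package can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Module names are assumptions: acados_template (acados Python interface),
    # imitation (the fork installed above), gymnasium (environment API).
    wanted = ["acados_template", "imitation", "gymnasium"]
    for name, found in check_packages(wanted).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```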
Hyper-parameters of the learning algorithms (rows marked COMMON are shared by all of them):

Hyper-parameter | Value |
---|---|
COMMON: Learning Rate | 0.0003 |
COMMON: Number of Expert Demos | 100 |
COMMON: Number of Training Steps | 2,000,000 |
PPO: Net. Arch. | pi:[256, 256] vf:[256, 256] |
PPO: Batch Size | 64 |
SAC: Net. Arch. | pi:[256, 256] qf:[256, 256] |
SAC: Batch Size | 256 |
BC: Net. Arch. | pi:[32, 32] qf:[32, 32] |
BC: Batch Size | 32 |
DAgger: Online Episodes | 500 |
Density: Kernel type | Gaussian |
Density: Kernel bandwidth | 0.5 |
Density: Net. Arch. | pi:[256, 256] qf:[256, 256] |
GAIL: Reward Net Arch. | [32, 32] |
GAIL: Policy Net Arch. | pi:[256, 256] qf:[256, 256] |
GAIL: Policy Replay Buffer Capacity | 512 |
GAIL: Batch Size | 128 |
AIRL: Reward Net Arch. | [32, 32] |
AIRL: Policy Net Arch. | pi:[256, 256] qf:[256, 256] |
AIRL: Batch Size | 128 |
AIRL: Policy Replay Buffer Capacity | 512 |
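
For scripting, the table can be mirrored in a plain Python dictionary. The sketch below is only an illustration: the dictionary layout, the key names, and the `settings_for` helper are assumptions and do not reflect how the repository stores these values. It also assumes the COMMON rows apply to every algorithm. Only a few rows are shown; the others follow the same pattern.

```python
# Values copied from the table above; layout and key names are hypothetical.
COMMON = {
    "learning_rate": 3e-4,
    "n_expert_demos": 100,
    "n_training_steps": 2_000_000,
}

PER_ALGORITHM = {
    "PPO": {"net_arch": {"pi": [256, 256], "vf": [256, 256]}, "batch_size": 64},
    "SAC": {"net_arch": {"pi": [256, 256], "qf": [256, 256]}, "batch_size": 256},
    "BC": {"net_arch": {"pi": [32, 32], "qf": [32, 32]}, "batch_size": 32},
    "GAIL": {
        "reward_net_arch": [32, 32],
        "net_arch": {"pi": [256, 256], "qf": [256, 256]},
        "replay_buffer_capacity": 512,
        "batch_size": 128,
    },
}


def settings_for(algorithm):
    """Merge the shared settings with the algorithm-specific ones."""
    return {**COMMON, **PER_ALGORITHM[algorithm]}


if __name__ == "__main__":
    print(settings_for("PPO"))
```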

Parameters of the optimal control problem and its solver:

Parameter | Value |
---|---|
Hessian Approximation | Gauss-Newton |
SQP type | real-time iterations |
$\Delta t$, $N$, $n_\mathrm{seg}$ | $5$ ms, $125$, $3$ |
$Q$ weights $w_{q_a}$, $\dot{w}_{q_a}$, $w_{q_p}$, $\dot{w}_{q_p}$ | $0.01$, $0.1$, $0.01$, $10$ |
$P_N$ | $\mathrm{diag}([1,1,1,0,0,0])\cdot 10^4$ |
$P$ | $\mathrm{diag}([1,1,1,0,0,0])\cdot 2\cdot 10^3$ |
$R$ | $\mathrm{diag}([1,10,10])$ |
$S$, $s$ | $\mathrm{diag}([1,1,1]\cdot 10^6)$, $[1,1,1]^\top\cdot 10^4$ |
$\delta_\mathrm{ee}$, $\delta_\mathrm{elb}$, $\delta_\mathrm{x}$ | $0.01$ m, $0.005$ m, $0\cdot 1_{n_x}$ |
$\overline{\dot{q}_a}=-\underline{\dot{q}_a}$ | $[2.5, 3.5, 3.5]^\top$ $\mathrm{s}^{-1}$ |
$\overline{u}=-\underline{u}$ | $[20,10,10]^\top$ Nm |
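
To make the notation concrete, the snippet below writes out several of the weight matrices and bounds from the table as plain Python lists. It is a sketch under assumptions: the `diag` helper and all variable names are introduced here for illustration, and the repository may assemble these quantities differently (for example with NumPy arrays).

```python
def diag(entries, scale=1.0):
    """Build a square matrix (list of rows) with scale * entries on the diagonal."""
    n = len(entries)
    return [[scale * entries[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


# Values copied from the table above; variable names are illustrative.
P_N = diag([1, 1, 1, 0, 0, 0], scale=1e4)
P = diag([1, 1, 1, 0, 0, 0], scale=2e3)
R = diag([1, 10, 10])
S = diag([1, 1, 1], scale=1e6)
s = [1e4, 1e4, 1e4]

# Symmetric box constraints: each lower bound is the negated upper bound.
u_max = [20.0, 10.0, 10.0]      # Nm
u_min = [-v for v in u_max]
dq_a_max = [2.5, 3.5, 3.5]      # s^-1
dq_a_min = [-v for v in dq_a_max]
```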

Parameters of the $\mathrm{SF}$ formulation:

Parameter | Value |
---|---|
$\Delta t_\mathrm{SF}$, $N_\mathrm{SF}$, $n_\mathrm{seg}$ | $10$ ms, $25$, $1$ |
$\bar{R}$ | $\mathrm{diag}([1,1,1])$ |
$R_\mathrm{SF}$ | $\mathrm{diag}([1,1,1])\cdot 10^{-5}$ |
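
The step size and the number of steps together fix how far ahead each formulation looks. As a worked example, the arithmetic below uses the values from the two parameter tables. It assumes that $N$ and $N_\mathrm{SF}$ count the steps of a uniform time grid; the variable names are illustrative.

```python
# Work in integer milliseconds to avoid floating-point rounding.
DT_MS, N = 5, 125          # Delta t = 5 ms, N = 125
DT_SF_MS, N_SF = 10, 25    # Delta t_SF = 10 ms, N_SF = 25

horizon_ms = DT_MS * N
horizon_sf_ms = DT_SF_MS * N_SF

print(f"horizon: {horizon_ms} ms, SF horizon: {horizon_sf_ms} ms")
# prints "horizon: 625 ms, SF horizon: 250 ms"
```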