jon--lee / ilqr

Iterative Linear-Quadratic Regulator / Differential Dynamic Programming

cannot control the pendulum #1

Open dujinyu opened 4 years ago

dujinyu commented 4 years ago

Hello! Did you manage to control the inverted pendulum successfully with this code?

iteration:  0  cost: 46906.8821282
iteration:  1  cost: 39965.4489433
iteration:  2  cost: 44619.0937448
iteration:  3  cost: 39615.7513945
iteration:  4  cost: 44339.6326178
iteration:  5  cost: 39556.862434
iteration:  6  cost: 44327.1138424
iteration:  7  cost: 39559.1609379
iteration:  8  cost: 44328.4758652
iteration:  9  cost: 39560.1271501
iteration:  10  cost: 44328.7161597
iteration:  11  cost: 39561.0841868
iteration:  12  cost: 44327.69723
iteration:  13  cost: 39561.4823859
iteration:  14  cost: 44331.4920414
iteration:  15  cost: 39559.2656203
iteration:  16  cost: 44327.5352405
iteration:  17  cost: 39561.163429
iteration:  18  cost: 44328.2883172
iteration:  19  cost: 39563.1459986
iteration:  20  cost: 44328.9058601
iteration:  21  cost: 39561.4135418
iteration:  22  cost: 44333.5054892
iteration:  23  cost: 39557.5357438
iteration:  24  cost: 44330.0401133
iteration:  25  cost: 39565.0163919
iteration:  26  cost: 44330.2156543
iteration:  27  cost: 39559.3820782
iteration:  28  cost: 44327.5159278
iteration:  29  cost: 39558.5928071
iteration:  30  cost: 44331.9625728
iteration:  31  cost: 39559.775944
iteration:  32  cost: 44324.5068705
iteration:  33  cost: 39559.9322083
iteration:  34  cost: 44328.0646829
iteration:  35  cost: 39560.7580598
iteration:  36  cost: 44332.6083168
iteration:  37  cost: 39564.2807349
iteration:  38  cost: 44329.03292
iteration:  39  cost: 39556.5351339

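The costs for 40 iterations are listed above. The cost does not decrease steadily; after the first few iterations it alternates between roughly 39,560 and 44,330, so the code does not seem to converge.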
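For reference, the pattern in the log can be checked with a short script. This is only a sketch: the values are the first eight costs copied from the output above, and the helper names `count_increases` and `alternates` are made up for illustration and are not part of the repository.

```python
# First eight costs copied from the log above.
costs = [
    46906.8821282, 39965.4489433, 44619.0937448, 39615.7513945,
    44339.6326178, 39556.862434, 44327.1138424, 39559.1609379,
]

def count_increases(values):
    """Count iterations whose cost is higher than the previous one."""
    return sum(1 for prev, cur in zip(values, values[1:]) if cur > prev)

def alternates(values):
    """Return True if every step changes direction relative to the last."""
    diffs = [cur - prev for prev, cur in zip(values, values[1:])]
    return all(d1 * d2 < 0 for d1, d2 in zip(diffs, diffs[1:]))

print("increases:", count_increases(costs))
print("strictly alternating:", alternates(costs))
```

A cost that rises on every second iteration means the updates are not being accepted only when they improve the cost, which is what "did not converge" refers to here.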

jon--lee commented 4 years ago

Hi! Thanks for pointing this out. I haven't looked at the code for a few years, but it was working when I uploaded it. Here are some potential causes for failure:

  1. Are you using Pendulum-v0 from OpenAI Gym? Unfortunately, I don't remember which version of gym I was using, so a version difference could also be a factor.
  2. I could not get this to work at all until I removed the control clipping that gym automatically applies in the environment. You may have to remove the clipping in your own copy of the gym files to get it to work.
  3. Have you tried running it for longer or with different initial conditions? iLQR is a local method and does not generally find the global optimum, so it may just be stuck in a local minimum.

Hope this helps!
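To illustrate point 2, here is a rough sketch of one plausible reason clipping hurts iLQR: once the control is outside the clip range, the next state no longer changes with the control, so the linearization that iLQR relies on sees a zero control gradient. The model below is a simplified pendulum written for this example. The function names, constants, and torque limit are assumptions for illustration and are not copied from gym or from this repository.

```python
import numpy as np

def pendulum_step(state, u, clip=True, max_torque=2.0,
                  g=10.0, m=1.0, l=1.0, dt=0.05):
    """One Euler step of a simplified pendulum (assumed model, not gym's source)."""
    th, thdot = state
    if clip:
        # Hypothetical torque limit, standing in for the environment's clipping.
        u = float(np.clip(u, -max_torque, max_torque))
    thdot_new = thdot + (-3.0 * g / (2.0 * l) * np.sin(th + np.pi)
                         + 3.0 / (m * l ** 2) * u) * dt
    th_new = th + thdot_new * dt
    return np.array([th_new, thdot_new])

def control_jacobian(state, u, clip, eps=1e-4):
    """Central finite-difference derivative of the next state w.r.t. the control."""
    plus = pendulum_step(state, u + eps, clip=clip)
    minus = pendulum_step(state, u - eps, clip=clip)
    return (plus - minus) / (2.0 * eps)

state = np.array([np.pi, 0.0])
# Control well outside the assumed torque limit.
print("clipped:  ", control_jacobian(state, 5.0, clip=True))
print("unclipped:", control_jacobian(state, 5.0, clip=False))
```

With clipping on, the derivative with respect to the control is zero for any control beyond the limit, so the backward pass gets no information about how to change it. With clipping off, the derivative is nonzero everywhere.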