stulp / dmpbbo

Python/C++ library for Dynamical Movement Primitives and Black-Box Optimization
GNU Lesser General Public License v2.1

Sequences of DMPs #53

Closed: adamconkey closed this issue 4 years ago

adamconkey commented 5 years ago

I am curious if your work regarding policy improvement for sequences of movement primitives is in the scope of this repository, for example the methods described in "Reinforcement Learning with Sequences of Motion Primitives for Robust Manipulation".

stulp commented 5 years ago

That's a very good question, with a rather complicated answer.

After the work you cite above, I found that a much simpler class of black-box optimization algorithms outperforms PI2 when using quasi open-loop policies such as DMPs: https://www.degruyter.com/downloadpdf/j/pjbr.2013.4.issue-1/pjbr-2013-0003/pjbr-2013-0003.pdf That is why PI2 is not implemented in this library: from a user's point of view, the simpler black-box optimization algorithms converge faster.

Explaining this properly in the documentation is actually a self-assigned issue (#45), but I have not gotten around to it yet. For now, see the paper above (impatient readers may skip to the conclusions in Section 5).

So if I were to reimplement that paper today, I would not use PI2. Instead, I would gather all the parameters of the DMPs in the sequence (those related to shape and goal) and optimize them simultaneously in one large search space using a simple BBO algorithm, e.g. with UpdaterMean or UpdaterCovarDecay (https://github.com/stulp/dmpbbo/blob/master/python/bbo/updaters.py).
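As a rough illustration of the idea of searching one large concatenated parameter space, here is a minimal, self-contained sketch of a simple black-box optimizer: it samples around a mean, weights the samples by cost, moves the mean to the weighted average, and shrinks the exploration magnitude. It is not code from this repository and does not use the classes in updaters.py; the function name `optimize`, its arguments, and the toy quadratic cost are all assumptions made for the example.

```python
import math
import random


def optimize(cost_fn, mean, sigma=1.0, decay=0.95, n_samples=20,
             n_updates=60, h=10.0, seed=0):
    """Hypothetical helper: reward-weighted averaging with decaying exploration."""
    rng = random.Random(seed)
    mean = list(mean)
    for _ in range(n_updates):
        # Sample candidate parameter vectors around the current mean
        # (isotropic Gaussian exploration with standard deviation sigma).
        samples = [[m + rng.gauss(0.0, sigma) for m in mean]
                   for _ in range(n_samples)]
        costs = [cost_fn(s) for s in samples]

        # Map costs to weights: the lowest cost gets weight 1, the highest exp(-h).
        c_min, c_max = min(costs), max(costs)
        span = (c_max - c_min) or 1.0
        weights = [math.exp(-h * (c - c_min) / span) for c in costs]
        total = sum(weights)

        # New mean is the weighted average of the samples.
        mean = [sum(w * s[i] for w, s in zip(weights, samples)) / total
                for i in range(len(mean))]

        # Shrink the exploration magnitude after every update.
        sigma *= decay
    return mean


# Toy usage: pretend the first 3 entries belong to one DMP and the last 2 to
# the next one; the optimizer only ever sees the concatenated vector.
target = [1.0, -2.0, 0.5, 3.0, -1.0]


def toy_cost(params):
    return sum((p - t) ** 2 for p, t in zip(params, target))


initial = [0.0] * len(target)
result = optimize(toy_cost, initial)
```

In a real setup the cost function would roll out the whole sequence on the robot or in simulation and return a scalar cost; the optimizer itself does not need to know where one DMP's parameters end and the next one's begin.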

To answer your question: no, PI2 is not implemented in this repository, so you could not reproduce the exact results of that paper. But I would expect even better results with the code in this repository, for the reasons explained in the Paladyn paper.

If you want to implement what I suggest, I would make a class "DmpSequence" whose only member variable is a vector of DMPs. DmpSequence should inherit from Parameterizable, which forces you to implement getParameterVectorAll; this would simply gather the parameters of each DMP in the sequence (including their goals). If that doesn't make sense, let me know and I can provide pseudo-code or do a quick-and-dirty implementation in a branch.
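To make the suggested structure a bit more concrete, here is a minimal sketch of such a container. It is deliberately independent of the library: `ToyDmp` is a hypothetical stand-in for a real DMP, and the method names `get_param_vector` / `set_param_vector` are assumptions for the example, not the actual Parameterizable interface (which would require getParameterVectorAll and its counterparts instead).

```python
class ToyDmp:
    """Hypothetical stand-in for a DMP: stores only shape weights and a goal."""

    def __init__(self, weights, goal):
        self.weights = list(weights)
        self.goal = list(goal)

    def get_param_vector(self):
        # Shape parameters first, then the goal.
        return self.weights + self.goal

    def set_param_vector(self, values):
        n_weights = len(self.weights)
        if len(values) != n_weights + len(self.goal):
            raise ValueError("parameter vector has the wrong length")
        self.weights = list(values[:n_weights])
        self.goal = list(values[n_weights:])


class DmpSequence:
    """Holds a list of DMPs and exposes their parameters as one flat vector."""

    def __init__(self, dmps):
        self.dmps = list(dmps)

    def get_param_vector(self):
        # Gather the parameters of each DMP in order (including goals).
        vector = []
        for dmp in self.dmps:
            vector.extend(dmp.get_param_vector())
        return vector

    def set_param_vector(self, values):
        sizes = [len(dmp.get_param_vector()) for dmp in self.dmps]
        if len(values) != sum(sizes):
            raise ValueError("parameter vector has the wrong length")
        # Hand each DMP its own slice of the flat vector.
        offset = 0
        for dmp, size in zip(self.dmps, sizes):
            dmp.set_param_vector(values[offset:offset + size])
            offset += size


# Usage: two primitives with different numbers of parameters.
sequence = DmpSequence([ToyDmp([1.0, 2.0], [3.0]), ToyDmp([4.0], [5.0, 6.0])])
flat = sequence.get_param_vector()
sequence.set_param_vector([v * 10 for v in flat])
```

With an interface like this, an optimizer only needs the flat vector: it reads it once, perturbs it, and writes each candidate back before a rollout of the full sequence.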

adamconkey commented 5 years ago

@stulp Thank you so much for your prompt and very informative response. I am hoping to start using your package, and I have a need to sequence multiple primitives. I'm unsure yet how much optimization at a sequence level will play a role in it, but it's good to know that is a possibility in your framework. Thanks again!

HongminWu commented 3 years ago

@stulp Thanks for your reply. Did you finish the class "DmpSequence" for sequencing multiple primitives?

stulp commented 3 years ago

The above explanation shows how I would go about it, but unfortunately I have had no time to actually implement this feature.

If you implement it for your project, I would highly appreciate it if you could make a pull request for that feature.