vlad17 / mve

MVE: model-based value estimation
Apache License 2.0

create stable online quantile clipping #389

Open vlad17 opened 6 years ago

vlad17 commented 6 years ago

See, e.g., the following quote from one of my emails:

One approach I discussed with Tuomas that might be considered "fair" and encourage some robustness is something like soft clipping all dimensions, where the soft clip is determined by a "soft" max of all prior observations, but the max can only go up by 10% of the range seen so far for that coordinate. This is just my ad-libbed pseudo-doubling scheme. While this is a minor point, I wanted to ask you if there's an established way of accomplishing the same thing.

This may increase robustness of continuous control systems. Currently, humanoid has clipped contact forces. See if we can remove those after this is implemented.
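
A rough sketch of the scheme from the quoted email, to make the proposal concrete. This is not code from the repo, and everything named here is hypothetical: the class `BoundedGrowthClipper`, the `growth` and `min_range` parameters, and the `update_and_clip` method. It makes three assumptions that the email does not state. First, it applies the same rule symmetrically to a lower bound. Second, it uses a floor `min_range` on the range so the bounds can grow at all right after the first observation (when the observed range is still zero). Third, it applies a plain hard clip to the slowly growing bounds rather than a smooth "soft" clip.

```python
import numpy as np


class BoundedGrowthClipper:
    """Clip each coordinate to a running [lo, hi] box that expands slowly.

    Hypothetical sketch: per update, each bound may move outward by at
    most `growth` times the range seen so far for that coordinate.
    """

    def __init__(self, growth=0.1, min_range=1e-3):
        self.growth = growth        # 0.1 matches the "10%" in the email
        self.min_range = min_range  # assumed floor so a zero range can still grow
        self.lo = None
        self.hi = None

    def update_and_clip(self, x):
        x = np.asarray(x, dtype=float)
        if self.lo is None:
            # First observation defines a degenerate box; nothing to clip yet.
            self.lo = x.copy()
            self.hi = x.copy()
            return x.copy()
        # Largest outward move allowed for each coordinate on this update.
        step = self.growth * np.maximum(self.hi - self.lo, self.min_range)
        # Bounds follow the new observation, but only up to `step` outward.
        self.hi = np.minimum(np.maximum(self.hi, x), self.hi + step)
        self.lo = np.maximum(np.minimum(self.lo, x), self.lo - step)
        # Hard clip to the updated box (a soft clip could be swapped in here).
        return np.clip(x, self.lo, self.hi)
```

With this sketch, a single huge outlier only nudges the bound outward by one step, while a sustained shift in scale lets the box grow geometrically, which is the "pseudo-doubling" flavor described above. Whether this behaves well enough to replace the clipped contact forces on humanoid would still need to be tested.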