brandontrabucco / design-bench

Benchmarks for Model-Based Optimization
MIT License
80 stars, 19 forks

Fix HopperController Evaluation #3

Closed: young-geng closed this 7 months ago

young-geng commented 2 years ago

Fixed HopperController evaluation by using a stochastic policy and averaging each policy evaluation over 10 rollouts.
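For readers skimming the thread, the idea can be sketched roughly as follows. This is not the code from the PR: `ToyEnv`, the linear policy, `noise_std`, and the function names are all hypothetical stand-ins chosen so the snippet runs with only NumPy. It only illustrates sampling actions from a stochastic policy and averaging the return over 10 rollouts.

```python
import numpy as np


class ToyEnv:
    """Hypothetical stand-in for the Hopper environment (not the real task)."""

    def __init__(self, horizon=20):
        self.horizon = horizon

    def reset(self):
        self.t = 0
        self.state = np.zeros(2)
        return self.state

    def step(self, action):
        self.t += 1
        self.state = self.state + 0.1 * action
        # Reward is 1 at the origin and decreases as the state drifts away.
        reward = 1.0 - float(np.sum(self.state ** 2))
        done = self.t >= self.horizon
        return self.state, reward, done


def rollout_return(env, weights, rng, stochastic=True, noise_std=0.1):
    """Run one episode of a linear policy and return the summed reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        action = weights @ obs  # deterministic (mean) action
        if stochastic:
            # Assumption: Gaussian noise around the mean action.
            action = action + noise_std * rng.standard_normal(action.shape)
        obs, reward, done = env.step(action)
        total += reward
    return total


def evaluate(env, weights, num_rollouts=10, stochastic=True, seed=0):
    """Average the episode return over several rollouts of the same policy."""
    rng = np.random.default_rng(seed)
    returns = [
        rollout_return(env, weights, rng, stochastic=stochastic)
        for _ in range(num_rollouts)
    ]
    return float(np.mean(returns))


weights = np.eye(2)  # hypothetical policy parameters
deterministic_score = evaluate(ToyEnv(), weights, num_rollouts=1, stochastic=False)
stochastic_score = evaluate(ToyEnv(), weights, num_rollouts=10, stochastic=True)
```

Averaging over several rollouts reduces the variance that sampling introduces, so the score assigned to a given set of policy weights is less sensitive to a single lucky or unlucky episode.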

brandontrabucco commented 2 years ago

Hello Young, thanks for sending this in, I've been meaning to fix this issue!

Would it make sense to incorporate an option for choosing either stochastic or deterministic evaluation, so that we can continue to support the original variant of the task via the same task id, and register the stochastic version with a new task id?

For example, the stochastic version can be HopperController-Exact-v1.
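A rough sketch of what that could look like is below. It deliberately uses a toy registry rather than the design-bench registration API, so `TaskSpec`, `REGISTRY`, `register`, and the `stochastic` / `num_rollouts` options are hypothetical names, and `HopperController-Exact-v0` is only assumed here to be the existing task id.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical record tying a task id to its evaluation options."""
    task_id: str
    oracle_kwargs: dict = field(default_factory=dict)


REGISTRY = {}  # toy registry, not the design-bench one


def register(task_id, **oracle_kwargs):
    """Store the evaluation options under a unique task id."""
    if task_id in REGISTRY:
        raise ValueError(f"task id already registered: {task_id}")
    REGISTRY[task_id] = TaskSpec(task_id, dict(oracle_kwargs))


# Original variant keeps its id (assumed to be v0) and deterministic evaluation.
register("HopperController-Exact-v0", stochastic=False, num_rollouts=1)

# Stochastic variant gets a new id, so existing results stay comparable.
register("HopperController-Exact-v1", stochastic=True, num_rollouts=10)
```

Keeping both ids means results reported against the original task remain reproducible, while new experiments can opt into the stochastic evaluation explicitly.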