FedUni / MORL

Multi-Objective Reinforcement Learning components built on top of RL glue components
Apache License 2.0

How to get the true Pareto Front of mountain car environment? #3

Open HONG-ZI opened 1 year ago

HONG-ZI commented 1 year ago

Paper [1] gives the true Pareto front of the mountain car environment, but it does not describe how that front was computed. Was the true Pareto front computed by exhaustive search?

[1] P. Vamplew, J. Yearwood, R. Dazeley, and A. Berry, “On the Limitations of Scalarisation for Multi-objective Reinforcement Learning of Pareto Fronts,” in AI 2008: Advances in Artificial Intelligence, vol. 5360.

Amp1874 commented 1 year ago

If I remember correctly it was a depth-first search, with leaves terminated if they were inferior to states that had been found earlier in the tree. We also used a similar approach to find the Pareto front for the MOPuddleWorld problem, but I no longer trust those results, since several people have reported being unable to reproduce them. Unfortunately the code was lost when that research assistant's contract ended.
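Since the original code is gone, the following is only a rough sketch of what a depth-first search with dominance pruning could look like, not a reconstruction of what was actually run. It assumes a deterministic, episodic environment with undiscounted additive vector rewards that are all maximised. The names `pareto_front_dfs`, `corridor_step` and the toy corridor itself are made up for illustration and are not part of this repository or of the mountain car setup in [1].

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Vec = Tuple[float, ...]


def dominates(a: Vec, b: Vec) -> bool:
    """True if a is >= b on every objective and > b on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def insert_nondominated(front: List[Vec], v: Vec) -> bool:
    """Add v unless it is dominated or already present; drop points v dominates."""
    if any(f == v or dominates(f, v) for f in front):
        return False
    front[:] = [f for f in front if not dominates(v, f)]
    front.append(v)
    return True


def pareto_front_dfs(
    start: Hashable,
    actions: Sequence[int],
    step: Callable[[Hashable, int], Tuple[Hashable, Vec, bool]],
    n_objectives: int,
    max_depth: int,
) -> List[Vec]:
    """Enumerate action sequences depth-first, pruning inferior branches."""
    seen: Dict[Hashable, List[Vec]] = {}  # per-state record of (return, -depth)
    front: List[Vec] = []                 # nondominated returns of finished episodes

    def visit(state: Hashable, ret: Vec, depth: int) -> None:
        # Treat -depth as an extra objective so that a branch is pruned only
        # when an earlier visit had at least as much return AND at least as
        # much remaining horizon.
        key = ret + (float(-depth),)
        if not insert_nondominated(seen.setdefault(state, []), key):
            return
        if depth == max_depth:
            return
        for a in actions:
            nxt, reward, done = step(state, a)
            new_ret = tuple(r + x for r, x in zip(ret, reward))
            if done:
                insert_nondominated(front, new_ret)
            else:
                visit(nxt, new_ret, depth + 1)

    visit(start, (0.0,) * n_objectives, 0)
    return sorted(front)


def corridor_step(pos: int, action: int) -> Tuple[int, Vec, bool]:
    """Hypothetical toy task: cells 0..4, small treasure at 0, large at 4.

    Rewards are (treasure, time penalty) with a penalty of -1 per step.
    """
    nxt = pos + action
    if nxt == 0:
        return nxt, (1.0, -1.0), True
    if nxt == 4:
        return nxt, (5.0, -1.0), True
    return nxt, (0.0, -1.0), False


print(pareto_front_dfs(1, (-1, +1), corridor_step, n_objectives=2, max_depth=10))
# [(1.0, -1.0), (5.0, -3.0)]
```

On a problem with continuous state such as mountain car, a search like this would presumably need some form of state discretisation before states could be compared, and that choice could easily change which branches get pruned.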

As a result we've largely moved away from comparing results against the "true front", and instead use other metrics like hypervolume, with appropriately chosen reference points. The exception is the Deep Sea Treasure problem, where it is simple to calculate the actual front.
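For anyone unfamiliar with the hypervolume metric, here is a minimal sketch for the two-objective case, assuming both objectives are maximised and the reference point is worse than every point of interest. The function name `hypervolume_2d` and the example numbers are illustrative only; they are not results from any of our experiments.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: Iterable[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded below by `ref` (maximisation)."""
    # Ignore points that do not strictly improve on the reference point.
    pts = [p for p in points if p[0] > ref[0] and p[1] > ref[1]]
    # Sweep from the largest first objective downwards; each point that raises
    # the best second objective seen so far adds one new horizontal strip.
    pts.sort(key=lambda p: p[0], reverse=True)
    area = 0.0
    best_y = ref[1]
    for x, y in pts:
        if y > best_y:
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


# Made-up front of (treasure, time penalty) pairs with reference point (0, -5).
print(hypervolume_2d([(1.0, -1.0), (5.0, -3.0)], ref=(0.0, -5.0)))  # 12.0
```

The value depends on the reference point, so a comparison between algorithms is only meaningful when every run is scored against the same one.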