LucasAlegre / morl-baselines

Implementations of multi-objective reinforcement learning (MORL) algorithms.
https://lucasalegre.github.io/morl-baselines
MIT License

The Pareto front generated by minecart is not optimal #72

Closed wilrop closed 10 months ago

wilrop commented 10 months ago

I discovered that the Pareto front generated by the minecart environment is not optimal. I found this out by comparing the hypervolume of the Pareto front included in the environment itself with the hypervolume of the front generated by GPI-LS. If the environment's Pareto front were optimal, its hypervolume would be greater than or equal to the one resulting from GPI-LS. Instead, I observed that the hypervolume obtained by GPI-LS is greater.
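The reasoning relies on hypervolume being monotone under Pareto dominance: adding a point that dominates part of a front can only grow the dominated region, so a truly optimal front can never have a smaller hypervolume than any achievable solution set (under the same gamma and reference point). A minimal sketch of that monotonicity in 2D, with a hand-rolled hypervolume instead of pygmo (the point values are made up for illustration):

```python
import numpy as np

def hypervolume_2d(points, ref):
    """Hypervolume (maximization) of a 2D point set w.r.t. a reference point.
    Sweeps the points sorted by the first objective and sums the dominated
    rectangles; only points strictly dominating the reference count."""
    pts = np.asarray(points, dtype=float)
    pts = pts[(pts > ref).all(axis=1)]          # drop points not above the ref
    pts = pts[np.argsort(-pts[:, 0])]           # sort by first objective, descending
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y > prev_y:                          # point extends the dominated area
            hv += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return hv

ref = np.array([0.0, 0.0])
front = np.array([[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]])
print(hypervolume_2d(front, ref))               # 6.0
# Adding a point that dominates part of the front can only increase the HV:
better = np.vstack([front, [2.5, 2.5]])
print(hypervolume_2d(better, ref))              # 7.25
```

So if GPI-LS reaches a larger hypervolume than the environment's front, GPI-LS must have found points the supposed "true" front does not dominate.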

Here is some code to reproduce the issue. I downloaded the Pareto front generated by the following run of GPI-LS: https://wandb.ai/openrlbenchmark/MORL-Baselines/runs/y6s3uaty. Note that the same gamma is used in both solution sets, so the comparison is valid.

import numpy as np
import pandas as pd
import pygmo as pg
import mo_gymnasium as mo_gym

env = mo_gym.make("minecart-v0")
pf = np.array(env.unwrapped.pareto_front(gamma=0.98, symmetric=True))

ref_point = np.array([-1, -1, -200])
hv_pf = pg.hypervolume(-pf).compute(-ref_point)

print("Minecart")
print(f"True PF HV: {hv_pf}")
print("----------")

FILE_NAME = "GPI_LS_front.csv"  # Put the filename of the CSV here.
found_vecs = pd.read_csv(FILE_NAME).to_numpy()
hv_pf = pg.hypervolume(-found_vecs).compute(-ref_point)
print(f"GPI-LS PF HV: {hv_pf}")

I think it would be worthwhile to also verify whether the convex hull generated by minecart is correct, but I did not check this myself.
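For the convex hull check, one possible test is that a correct convex coverage set (CCS) should, for every weight vector on the simplex, achieve a linearly scalarized value at least as high as any other achievable point. A minimal sketch of that check using randomly sampled weights (the function name and the toy data are illustrative, not part of the environment's API):

```python
import numpy as np

def ccs_covers(candidate_ccs, reference_points, num_weights=1000, seed=0, atol=1e-8):
    """Return True if, for randomly sampled simplex weights, the candidate
    convex coverage set scalarizes at least as well as every reference point."""
    rng = np.random.default_rng(seed)
    dim = candidate_ccs.shape[1]
    # Sample weight vectors uniformly from the simplex via a Dirichlet draw.
    weights = rng.dirichlet(np.ones(dim), size=num_weights)
    best_ccs = (weights @ candidate_ccs.T).max(axis=1)      # best value in the CCS
    best_ref = (weights @ reference_points.T).max(axis=1)   # best reference value
    return bool(np.all(best_ccs + atol >= best_ref))

# Toy example: (0.5, 0.5) lies under the segment between (1, 0) and (0, 1),
# so the two extreme points cover it under any linear scalarization.
ccs = np.array([[1.0, 0.0], [0.0, 1.0]])
print(ccs_covers(ccs, np.array([[0.5, 0.5]])))  # True
print(ccs_covers(ccs, np.array([[2.0, 2.0]])))  # False: (2, 2) beats the CCS
```

Applied here, `reference_points` would be the vectors recovered by GPI-LS and `candidate_ccs` the hull returned by minecart; any weight where the check fails pinpoints a missing hull point.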

wilrop commented 10 months ago

This issue should have been opened in the MO-Gymnasium repo. I apologise for the confusion.