LucasAlegre / morl-baselines

Implementations of Multi-Objective Reinforcement Learning algorithms.
https://lucasalegre.github.io/morl-baselines
MIT License

Performance report issue tracker #43

Open · ffelten opened this issue 1 year ago

This issue exists to coordinate who is running what, and to give a more or less live view of the performance results being uploaded to openrlbenchmark.

See all runs under the openrlbenchmark entity on Weights & Biases.

How to help?

Put your name next to an algo/env combination and post status updates on the runs as you make them.

Run the benchmark script with the following command:

```
python benchmark/launch_experiment.py --algo <ALGO> --env-id <ENV_ID> --num-timesteps 1000000 --gamma 0.99 --ref-point ... --auto-tag True --wandb-entity openrlbenchmark --seed <0 to 9> --init-hyperparams ... --train-hyperparams ...
```
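For concreteness, a filled-in example is sketched below. The env id and reference point are placeholders chosen for illustration (they are not prescribed by this tracker), and the loop covers the requested seeds 0 to 9:

```
# Hypothetical example: deep-sea-treasure-v0 and the ref point are
# illustrative placeholders, not values mandated by this issue.
# Launches one run per seed, 0 through 9.
for seed in {0..9}; do
  python benchmark/launch_experiment.py \
    --algo envelope \
    --env-id deep-sea-treasure-v0 \
    --num-timesteps 1000000 \
    --gamma 0.99 \
    --ref-point 0.0 -50.0 \
    --auto-tag True \
    --wandb-entity openrlbenchmark \
    --seed $seed
done
```

If you cannot write to the openrlbenchmark entity, point `--wandb-entity` at your own.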

Deterministic envs

For all deterministic environments, we push the learning rate to 1.0 and raise the exploration rate, since fast exploration is all that matters in these cases; a sketch of passing such overrides follows below. Our deterministic envs:
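As a hedged sketch of what such an override could look like: the hyperparameter names `learning_rate` and `initial_epsilon` are assumptions about the tabular algorithms' constructors, and the env id and ref point are again placeholders, so verify against the algorithm's signature before copying.

```
# Hypothetical deterministic-env override: force the learning rate to 1.0
# and start exploration high. Hyperparameter names are assumptions, not
# verified against each algorithm's constructor.
python benchmark/launch_experiment.py --algo mpmoql --env-id deep-sea-treasure-v0 \
  --num-timesteps 1000000 --gamma 0.99 --ref-point 0.0 -50.0 \
  --auto-tag True --wandb-entity openrlbenchmark --seed 0 \
  --init-hyperparams "learning_rate:1.0" "initial_epsilon:1.0"
```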

Multi-policy

✅ CAPQL

✅ GPI-LS continuous

`--algo gpi_ls_continuous`

✅ GPI-PD continuous

`--algo gpi_pd_continuous`

✅ GPI-LS discrete

`--algo gpi_ls_discrete`

✅ GPI-PD discrete

`--algo gpi_pd_discrete`

✅ Envelope

`--algo envelope`

✅ PGMORL

`--algo pgmorl`

PCN

`--algo pcn`

✅ PQL (deterministic envs)

`--algo pql`

✅ GPI-LS tabular

`--algo gpi-ls --init-hyperparams "use_gpi_policy:True"`

✅ MPMOQL

`--algo mpmoql`

✅ OLS

`--algo ols --init-hyperparams "weight_selection_algo:'ols'" "epsilon_ols:0.0"`
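The `--init-hyperparams` and `--train-hyperparams` overrides used above are passed as `key:value` strings (e.g. `"epsilon_ols:0.0"`, `"use_gpi_policy:True"`). As a rough illustration of that pattern only, not the actual parsing code in benchmark/launch_experiment.py, such strings could be turned into keyword arguments like this:

```python
# Illustrative parser for "key:value" override strings. This is NOT the
# actual implementation in benchmark/launch_experiment.py; it only shows
# the pattern the CLI examples above rely on.
import ast
from typing import Any, Dict, List


def parse_overrides(pairs: List[str]) -> Dict[str, Any]:
    """Turn ["epsilon_ols:0.0", "use_gpi_policy:True"] into kwargs."""
    kwargs: Dict[str, Any] = {}
    for pair in pairs:
        key, raw = pair.split(":", 1)  # split only on the first colon
        kwargs[key] = ast.literal_eval(raw)  # numbers, bools, quoted strings
    return kwargs


print(parse_overrides(["weight_selection_algo:'ols'", "epsilon_ols:0.0"]))
# {'weight_selection_algo': 'ols', 'epsilon_ols': 0.0}
```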

Single-policy

MOQL

EUPG