EdoardoPona closed this issue 2 years ago.
Hey @EdoardoPona thanks for your interest in EvoJAX!
Many algorithms are sensitive to hyper-parameters, so hyper-parameter searches have been conducted and recorded in scripts/benchmarks/Readme.md. For example, scripts/benchmarks/figures/ARS/cartpole_easy.png shows that ARS performs well on the Cartpole (Easy) task with init_stdev = 0.1 and lrate_init = 0.1.
You can therefore reproduce that result by setting init_stdev: 0.1 and lrate_init: 0.1 in scripts/benchmarks/configs/ARS/cartpole_easy.yaml and rerunning python train.py -config configs/ARS/cartpole_easy.yaml. Doing so, I got the following result, which is consistent with the table:
cartpole_easy: 2022-09-26 02:21:50,723 [INFO] [TEST] #tests=100, max=933.8828, avg=913.0647, min=168.9905, std=76.7758
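For anyone scripting this change rather than editing the file by hand, here is a minimal sketch of the override described above. It is not part of EvoJAX: the helper name override_yaml_scalars is hypothetical, the sample text is a stand-in rather than the real contents of cartpole_easy.yaml, and it only handles plain "key: value" lines (any trailing comment on a matched line is dropped).

```python
import re


def override_yaml_scalars(text: str, overrides: dict) -> str:
    """Replace the value of simple `key: value` lines, keeping indentation.

    Hypothetical helper: works on raw text so it needs no YAML library,
    and it assumes each key appears on a line of its own.
    """
    for key, value in overrides.items():
        pattern = re.compile(
            rf"^([ \t]*{re.escape(key)}[ \t]*:[ \t]*).*$", flags=re.MULTILINE
        )
        # A function replacement avoids backslash-escaping issues in the value.
        text, count = pattern.subn(lambda m, v=value: f"{m.group(1)}{v}", text)
        if count == 0:
            raise KeyError(f"{key!r} not found in config text")
    return text


# Stand-in fragment: only the two key names and their values come from this thread.
sample = "init_stdev: 0.03\nlrate_init: 0.01\n"
updated = override_yaml_scalars(sample, {"init_stdev": 0.1, "lrate_init": 0.1})
print(updated)
```

To apply it to a real file, read the file's text, pass it through the helper, and write the result back before rerunning train.py.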
Please let me know how it works for you and if there are any further questions!
Thank you for the clarification! All works correctly now.
My misunderstanding came from assuming that the hyper-parameters in the .yaml files linked in the table, such as scripts/benchmarks/configs/ARS/cartpole_easy.yaml, were the final optimised ones. Instead, that file sets init_stdev: 0.03 and lrate_init: 0.01.
I did not read the heatmaps correctly.
Hello everyone.
I am currently trying to reproduce scores from the benchmarks, specifically for ARS, as I am implementing my own version natively in JAX and want to compare it with the wrapper already implemented.
For example, I cannot achieve the score posted in the benchmark table (902.107) for ARS on cartpole_easy.
Running
python train.py -config configs/ARS/cartpole_easy.yaml
yields the following training logs.
I am not entirely sure whether the result in the benchmark table is intended to be the 720.5952 from
cartpole_easy: 2022-09-25 22:49:12,457 [INFO] Training done, best_score=720.5952
or the max score from the final test. Regardless, neither of these matches the one posted in the benchmark table.
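To compare the candidate numbers side by side, here is a rough sketch that pulls both the training best_score and the final test statistics out of log text. The function name parse_scores and the regular expressions are assumptions inferred from the two log lines quoted in this thread, not an EvoJAX API, and the two sample lines come from different runs.

```python
import re

# Patterns inferred from the log lines quoted in this thread (an assumption,
# not a documented format).
NUM = r"-?\d+(?:\.\d+)?"
BEST_LINE = re.compile(rf"best_score=({NUM})")
TEST_LINE = re.compile(
    rf"\[TEST\] #tests=(\d+), max=({NUM}), avg=({NUM}), min=({NUM}), std=({NUM})"
)


def parse_scores(log_text: str) -> dict:
    """Collect the training best score and the final test statistics, if present."""
    scores = {}
    best = BEST_LINE.search(log_text)
    if best:
        scores["best_score"] = float(best.group(1))
    test = TEST_LINE.search(log_text)
    if test:
        scores["n_tests"] = int(test.group(1))
        for name, value in zip(("max", "avg", "min", "std"), test.groups()[1:]):
            scores[f"test_{name}"] = float(value)
    return scores


# Both lines are copied from this thread.
train_line = "cartpole_easy: 2022-09-25 22:49:12,457 [INFO] Training done, best_score=720.5952"
test_line = (
    "cartpole_easy: 2022-09-26 02:21:50,723 [INFO] [TEST] #tests=100, "
    "max=933.8828, avg=913.0647, min=168.9905, std=76.7758"
)
scores = parse_scores(train_line + "\n" + test_line)
for name, value in scores.items():
    print(f"{name}: {value}")
```

The sketch only extracts the values; it does not settle which of them the benchmark table reports.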
Am I doing something wrong to reproduce these scores? This makes me unable to compare my own implementation of the algorithm.
Thank you