Open Shenzhi-Wang opened 2 years ago
I think you just need to smooth the curve: each epoch contains a single rollout, which either succeeds or fails, so the raw per-epoch return can only be 0 or 1. Can you average the returns over a moving window and plot it again? Our results were plotted with https://github.com/rail-berkeley/rlkit/blob/master/rlkit/visualization/plot_util.py
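For anyone wanting to try the moving-window average suggested here, a minimal sketch with NumPy might look like the following. This is not the code in rlkit's plot_util.py; the function name `moving_average`, the default window of 10, and the toy input are assumptions made purely for illustration.

```python
import numpy as np


def moving_average(returns, window=10):
    """Trailing moving average of per-epoch returns.

    Hypothetical helper (not part of rlkit). The first few entries are
    averaged over however many points are available so far, so the output
    has the same length as the input.
    """
    returns = np.asarray(returns, dtype=float)
    if window < 1:
        raise ValueError("window must be >= 1")
    # Prefix sums with a leading 0 so that csum[j] - csum[i] == sum(returns[i:j]).
    csum = np.cumsum(np.insert(returns, 0, 0.0))
    end = np.arange(1, len(returns) + 1)
    start = np.maximum(end - window, 0)
    return (csum[end] - csum[start]) / (end - start)


# Toy 0/1 returns (synthetic, only to show the call), one rollout per epoch.
toy_returns = [0, 1, 0, 1]
smoothed = moving_average(toy_returns, window=2)
```

The smoothed array can then be plotted against the epoch index in place of the raw returns. A larger window gives a smoother curve at the cost of lagging behind recent changes.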
I've run examples/iql/antmaze_finetune.py, but the results are very poor: the returns oscillate between 0 and 1 (as shown in the figure below), which is totally different from the result figures in examples/iql/README.md.