I am trying to reproduce the results for the MPE environments as shown in the image using the default config files.
However, the reward curves for the speaker listener (continuous) environment are very different from the ones shown. Could you share the experiment setup used for those curves?
Hi, the experiment setups are detailed in the README. To reproduce the results, I recommend using the tuned configs for those experiments instead of the default configs. Hope it helps. :)