the training of CMA-ES shows high average reward but when you just check the model against the log the rewards are practically zero

hardmaru / WorldModelsExperiments

World Models Experiments


the training of CMA-ES shows high average reward but when you just check the model against the log the rewards are practically zero #11

Closed itabhiyanta closed 5 years ago

itabhiyanta commented 5 years ago

Hi @hardmaru

Thanks for posting this repo. I have a strange issue: I see a very promising curve for the training of my CMA-ES model, but I cannot replicate the results when I execute the following command.

python3.5 model.py log/filewiththe best stats.json dispatch

I am using a custom environment.

I also wish to ask you something about the number of processors for training the CMA-ES model. I used 16 processors and also 48 processors (I couldn't use 64 processors, as I then run out of memory). Do you think reducing the number of processors for training the CMA-ES model will have some adverse effect?

Kindly advise. Rohit

hardmaru commented 5 years ago

You may have forgotten an extra flag (render/norender):

python model.py render log/carracing.cma.16.64.best.json

Check out the blog post: http://blog.otoro.net/2018/06/09/world-models-experiments/
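To illustrate why a missing flag can go unnoticed, here is a minimal sketch of a parser that reads positional arguments in a fixed order and rejects a command line whose first argument is not a mode flag. It is an assumption for illustration only: `parse_cli` and `VALID_MODES` are hypothetical names, and this is not the actual argument handling in the repository's model.py, which is not shown in this thread.

```python
# Hypothetical sketch: NOT the repository's model.py.
# Assumes a command line of the form: <mode> <model_path>
VALID_MODES = ("render", "norender")


def parse_cli(args):
    """Return (render_mode, model_path) from positional arguments.

    Raises ValueError when the mode flag is missing, instead of
    silently treating the model path as the mode.
    """
    if len(args) < 2 or args[0] not in VALID_MODES:
        raise ValueError(
            "expected: render|norender path/to/model.json, got: %r" % (args,)
        )
    render_mode = args[0] == "render"
    model_path = args[1]
    return render_mode, model_path


# Flag present: the model path lands in the right slot.
print(parse_cli(["render", "log/carracing.cma.16.64.best.json"]))

# Flag omitted: the call is rejected rather than misread.
try:
    parse_cli(["log/carracing.cma.16.64.best.json"])
except ValueError as err:
    print("rejected:", err)
```

Failing loudly on a missing flag is a design choice; a script that instead falls back to defaults may run without error while evaluating something other than the trained model.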



itabhiyanta commented 5 years ago

Yep, that was it. I didn't use it, thinking that since I don't use the gym environment in general, it didn't apply to me. Thanks.

hardmaru commented 5 years ago

Cool. I'd be interested to see any results for custom environments, and I look forward to seeing your publications in the future.