world-modelz / dreamax

A scalable Dreamer implementation in JAX
MIT License

Restore overall base training performance of April 7 repo state (before refactoring) #20

Open andreaskoepf opened 2 years ago

andreaskoepf commented 2 years ago

Working commit: https://github.com/world-modelz/dreamax/commit/3e0ac35f7e44946a26430ca489e53ed415c84aa3

Pendulum: after ~30 min of training (100k env steps), the average return is well above 500.

andreaskoepf commented 2 years ago

Baseline run of the old version: TensorBoard events file, charts (screenshot).

andreaskoepf commented 2 years ago

I reverted the main branch (git reset --hard followed by a force push) back to the working state of April 7. The refactoring commits have been moved to the xmaster_refactor branch. I suggest treating xmaster_refactor as a sealed, temporary branch and reintroducing its changes into main cleanly, in multiple small steps.

General notes: