Feature: abstract agent design and an example DQN implementation

mansicer / baselax

Baselax (Baselines + JAX) provides stable-baselines-style implementations of reinforcement learning (RL) algorithms with Google JAX framework.

https://baselax.readthedocs.io

Apache License 2.0

4 stars 1 forks source link

Feature: abstract agent design and an example DQN implementation #7

Closed mansicer closed 2 years ago

mansicer commented 2 years ago

This PR proposes an abstract BaseAgent design and an implementation of DQN based on this design. More details will be revealed with docstring format soon. Currently the TODO list contains:

[ ] A stable-baselines-style learn method to implement the run_loop function as currently examples show.
[ ] Return intermediate variables for the update method (e.g., chosen Q-values, target Q-values, different portions of the total loss).
[ ] Logging support for learn method such as TensorBoard and WandB.

mansicer commented 2 years ago

Some minor updates to file structures and dependencies:

rename the agent submodule to agents.
add stable-baselines3 requirement as a temporary solution to VecEnv (which should be replaced as its heavy dependency on PyTorch).
modify dm-haiku package source from GitHub repo to PyPI.

mansicer commented 2 years ago

As current agent-design branch contains many changes to the core package (dependencies, versions, etc.). We decide to merge this PR and leave support of the rest of features support as further developments.