Closed. mansicer closed this 2 years ago.
Some minor updates to file structure and dependencies:
- Rename the `agent` submodule to `agents`.
- Add a `stable-baselines3` requirement as a temporary solution for VecEnv (which should be replaced because of its heavy dependency on PyTorch).
- Change the `dm-haiku` package source from the GitHub repo to PyPI.

Because the current agent-design branch contains many changes to the core package (dependencies, versions, etc.), we have decided to merge this PR and leave support for the remaining features to further development.
This PR proposes an abstract `BaseAgent` design and an implementation of DQN based on this design. More details will be provided in docstrings soon. Currently the TODO list contains:
- A `learn` method that implements the `run_loop` function, as the current examples show.
- More information returned from the `update` method (e.g., chosen Q-values, target Q-values, different portions of the total loss).
- Logging support in the `learn` method, such as TensorBoard and WandB.
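
To make the design discussion concrete, here is a minimal sketch of how an abstract `BaseAgent` with `update` and `learn` methods might be laid out. It is not the code in this PR. The `predict` method, the `reset_fn`/`step_fn`/`log_fn` parameters, the transition dictionary keys, and the `TabularQAgent` class are all hypothetical names chosen for illustration. A tabular Q-learning agent stands in for DQN so the example runs without JAX or Haiku, and a plain callback stands in for a TensorBoard or WandB logger.

```python
import random
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Optional, Tuple

Transition = Dict[str, Any]
Metrics = Dict[str, float]


class BaseAgent(ABC):
    """Hypothetical interface; only `learn` and `update` are named in the PR."""

    @abstractmethod
    def predict(self, obs: Any) -> Any:
        """Choose an action for one observation (assumed method name)."""

    @abstractmethod
    def update(self, transition: Transition) -> Metrics:
        """Run one training step and return scalar diagnostics."""

    def learn(
        self,
        reset_fn: Callable[[], Any],
        step_fn: Callable[[Any], Tuple[Any, float, bool]],
        total_steps: int,
        log_fn: Optional[Callable[[int, Metrics], None]] = None,
    ) -> None:
        """Simple interaction loop: act, step the environment, update, log."""
        obs = reset_fn()
        for t in range(total_steps):
            action = self.predict(obs)
            next_obs, reward, done = step_fn(action)
            metrics = self.update(
                {"obs": obs, "action": action, "reward": reward,
                 "next_obs": next_obs, "done": done}
            )
            if log_fn is not None:
                log_fn(t, metrics)  # a real logger could be wrapped in this callback
            obs = reset_fn() if done else next_obs


class TabularQAgent(BaseAgent):
    """Tabular Q-learning stand-in for DQN (no neural network, no target network)."""

    def __init__(self, n_states, n_actions, lr=0.5, gamma=0.9, epsilon=0.5, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon
        self.rng = random.Random(seed)

    def predict(self, obs: int) -> int:
        # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q[obs]))
        row = self.q[obs]
        return row.index(max(row))

    def update(self, transition: Transition) -> Metrics:
        s, a = transition["obs"], transition["action"]
        # No bootstrapping from the next state when the episode has ended.
        bootstrap = 0.0 if transition["done"] else max(self.q[transition["next_obs"]])
        q_chosen = self.q[s][a]
        q_target = transition["reward"] + self.gamma * bootstrap
        td_error = q_target - q_chosen
        self.q[s][a] += self.lr * td_error
        # Return the diagnostics mentioned in the TODO list.
        return {"q_chosen": q_chosen, "q_target": q_target, "loss": 0.5 * td_error ** 2}


# Toy one-state environment: action 1 pays 1.0, action 0 pays 0.0, every step ends the episode.
def reset_fn():
    return 0


def step_fn(action):
    return 0, float(action == 1), True


logs = []
agent = TabularQAgent(n_states=1, n_actions=2)
agent.learn(reset_fn, step_fn, total_steps=200, log_fn=lambda t, m: logs.append((t, m)))
```

Returning a metrics dictionary from `update` keeps the training step independent of any particular logging backend: `learn` only forwards the dictionary to whatever callback it is given.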