Open selfint opened 3 years ago
Implement a single Q-learning trainer, and separate agents for:
Each agent will have its own implementation of the Q function and the Q update function, but all of them are trained the same way.
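A minimal sketch of how that split might look, under my own assumptions: the names `QAgent`, `TabularQAgent`, `train` and `chain_step` are hypothetical and not from this repo, and the tabular agent and the tiny chain environment are only there to make the example runnable. The trainer only ever calls `q` and `update`, so any agent providing those two methods could be trained by the same loop.

```python
import random
from typing import Hashable, Protocol


class QAgent(Protocol):
    """Hypothetical interface: the two pieces each agent implements itself."""

    def q(self, state: Hashable, action: Hashable) -> float: ...

    def update(self, state: Hashable, action: Hashable, target: float) -> None: ...


class TabularQAgent:
    """One possible agent (an assumption): Q values stored in a dict."""

    def __init__(self, learning_rate: float = 0.5) -> None:
        self.learning_rate = learning_rate
        self.table = {}

    def q(self, state, action):
        # Unseen (state, action) pairs default to 0.
        return self.table.get((state, action), 0.0)

    def update(self, state, action, target):
        # Move the stored value a fraction of the way toward the target.
        old = self.q(state, action)
        self.table[(state, action)] = old + self.learning_rate * (target - old)


def train(agent, actions, step, start, episodes=200, max_steps=100,
          gamma=0.9, epsilon=0.2, seed=0):
    """Shared Q-learning loop; it only relies on agent.q and agent.update."""
    rng = random.Random(seed)
    for _ in range(episodes):
        state = start
        for _ in range(max_steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda a: agent.q(state, a))
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next value.
            best_next = 0.0 if done else max(agent.q(next_state, a) for a in actions)
            agent.update(state, action, reward + gamma * best_next)
            if done:
                break
            state = next_state
    return agent


def chain_step(state, action):
    """Toy environment (an assumption): states 0..3, reward 1 on reaching 3."""
    next_state = min(max(state + action, 0), 3)
    done = next_state == 3
    return next_state, (1.0 if done else 0.0), done


agent = train(TabularQAgent(), actions=(1, -1), step=chain_step, start=0)
```

A different agent (for example one backed by a function approximator) would only need to supply its own `q` and `update`; the `train` function would stay unchanged.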
Q agents with continuous actions aren't simple to train. Maybe this article can help?