Open llewynS opened 1 year ago
The learn method will be changed significantly in the upcoming releases, tightening it's interface and simplifying the way in which it is called by the trainer. The dosctrings will change accordingly. Thanks for raising the question about modularity and independence on the buffer implementations! I will keep it in mind
Not really a coding issue. You assert in your paper that the code base is highly modular but the algorithms are very strongly tied to your implementation of a replay buffer. All of the static methods in the base policy require the replay buffer.
It would be non-trivial to decouple the algorithmic implementations from the replay.
I would like to recommend that you rename learn to _learn. learn without the underscore implies that it is a stand alone method that can be publicly called but, at least for DDPG and child classes, learn requires everything surrounding it in update.
The docstring of learn is also not indicative of this limitation.