ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

DQN Minibatch Option #8870

Open houcharlie opened 4 years ago

houcharlie commented 4 years ago

Describe your feature request

Would it be possible to allow gradient accumulation for DQN? Or is there an algorithmic reason why huge batches for gradient calculation aren't useful for DQN?

ericl commented 4 years ago

Sure. I don't know of any such reason, though I would guess it would only be helpful if the model activations are using a huge amount of memory (i.e., much bigger than your typical RL model).

I haven't tried this out but the patch would be something like the following:

diff --git a/rllib/agents/dqn/dqn.py b/rllib/agents/dqn/dqn.py
index 76bc21817..d613ecef7 100644
--- a/rllib/agents/dqn/dqn.py
+++ b/rllib/agents/dqn/dqn.py
@@ -276,7 +276,10 @@ def execution_plan(workers, config):
     post_fn = config.get("before_learn_on_batch") or (lambda b, *a: b)
     replay_op = Replay(local_buffer=local_replay_buffer) \
         .for_each(lambda x: post_fn(x, workers, config)) \
-        .for_each(TrainOneStep(workers)) \
+        .for_each(ComputeGradients(workers))  \
+        .batch(num_microbatches)  \
+        .for_each(AverageGradients())  \
+        .for_each(ApplyGradients(workers)) \
         .for_each(update_prio) \
         .for_each(UpdateTargetNetwork(
             workers, config["target_network_update_freq"]))
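
For reference, the three operators in the patch boil down to a standard gradient-accumulation pattern: compute per-microbatch gradients, average them, then apply a single update. Here is a minimal framework-agnostic sketch of that pattern in NumPy; the loss and the `compute_gradients` / `average_gradients` / `apply_gradients` helpers are hypothetical stand-ins, not RLlib's actual implementations:

```python
import numpy as np

def compute_gradients(params, batch):
    # Stand-in for ComputeGradients: gradient of a toy squared loss
    # 0.5 * ||w - mean(batch)||^2 on one microbatch.
    return {k: params[k] - batch[k].mean(axis=0) for k in params}

def average_gradients(grad_list):
    # Stand-in for AverageGradients: elementwise mean over microbatches.
    return {k: np.mean([g[k] for g in grad_list], axis=0)
            for k in grad_list[0]}

def apply_gradients(params, grads, lr=0.1):
    # Stand-in for ApplyGradients: one SGD step with the averaged gradient.
    return {k: params[k] - lr * grads[k] for k in params}

params = {"w": np.zeros(3)}
# Three microbatches; only one set of activations/gradients is "live"
# at a time, which is where the memory saving comes from.
microbatches = [{"w": np.full((4, 3), v)} for v in (1.0, 2.0, 3.0)]

grads = [compute_gradients(params, mb) for mb in microbatches]
params = apply_gradients(params, average_gradients(grads))
print(params["w"])  # one update using the average over all microbatches
```

The update is mathematically equivalent to training on the full batch at once (for losses that average over samples), while peak memory scales with the microbatch size instead of the full batch size.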