rstudio / keras3

R Interface to Keras
https://keras3.posit.co/

Should train_on_batch have option to omit backprop? #904

Open dslate1 opened 4 years ago

dslate1 commented 4 years ago

I am using Keras train_on_batch, with the TensorFlow backend, to train a convolutional neural network on some high-resolution images. This works fairly well, but because of the memory limits of my NVIDIA Quadro GP100 GPU I have to constrain either the batch size or the image resolution below what might be optimal. It occurred to me that if train_on_batch had an optional parameter to just accumulate gradients and defer the weight update to a later call, I could feed it larger images in smaller batches and still get the effect of a larger batch over several calls to train_on_batch.

Would this be a desirable enhancement to train_on_batch? Perhaps I could accomplish it myself by pulling apart the Python Keras code for train_on_batch and creating my own version, but do you have ideas on the easiest way to achieve this in Keras 2.2.5 using only the R interface?

Thanks.

dfalbel commented 4 years ago

The recommended way to do this in TF2 is to build a custom training loop. For example: https://r-tensorflow.netlify.com/tutorials/advanced/customization/custom-training/#define-a-training-loop
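To make the suggestion concrete, here is a minimal sketch of gradient accumulation inside a TF2-style custom training loop, written with the R keras/tensorflow packages. The model, optimizer, loss, `accum_steps`, and the `train_micro_batch` helper are all illustrative choices, not part of any Keras API: gradients from several small "micro-batches" are summed into accumulator variables, and the optimizer step is applied only once every `accum_steps` calls, approximating a larger effective batch.

```r
library(keras)
library(tensorflow)

# Illustrative small model; any Keras model works the same way.
model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 8, kernel_size = 3, activation = "relu",
                input_shape = c(64, 64, 3)) %>%
  layer_flatten() %>%
  layer_dense(units = 10, activation = "softmax")

optimizer   <- optimizer_adam()
accum_steps <- 4  # micro-batches accumulated per weight update (assumed value)

# One zero-initialized accumulator per trainable weight.
accum_grads <- lapply(model$trainable_variables, function(v)
  tf$Variable(tf$zeros_like(v), trainable = FALSE))

# Hypothetical helper: compute gradients for one micro-batch, accumulate
# them, and apply the averaged update every `accum_steps` calls.
train_micro_batch <- function(x, y, step) {
  with(tf$GradientTape() %as% tape, {
    preds <- model(x, training = TRUE)
    loss  <- loss_categorical_crossentropy(y, preds)
  })
  grads <- tape$gradient(loss, model$trainable_variables)

  # Accumulate instead of applying immediately.
  for (i in seq_along(grads))
    accum_grads[[i]]$assign_add(grads[[i]])

  if (step %% accum_steps == 0) {
    # Average over the accumulated micro-batches, apply, then reset.
    scaled <- lapply(accum_grads, function(g) g / accum_steps)
    optimizer$apply_gradients(
      purrr::transpose(list(scaled, model$trainable_variables)))
    for (g in accum_grads) g$assign(tf$zeros_like(g))
  }
}
```

Called in a loop over micro-batches (`train_micro_batch(x_batch, y_batch, step)` with `step` counting up from 1), this trades one large forward/backward pass for several small ones, so peak GPU memory is bounded by the micro-batch size rather than the effective batch size.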