rstudio / keras3

R Interface to Keras
https://keras3.posit.co/

Should train_on_batch have option to omit backprop? #904

Open dslate1 opened 4 years ago

dslate1 commented 4 years ago

I am using Keras train_on_batch, with the TensorFlow backend, to train a convolutional neural network on some high-resolution images. This works fairly well, but because of the memory limits of my NVIDIA Quadro GP100 GPU I have to constrain either the batch size or the image resolution below what might be optimal. It occurred to me that if train_on_batch had an optional parameter to just accumulate gradients and defer the weight update to a later call, I could feed it larger images in smaller batches and still get the effect of a larger batch over several calls to train_on_batch.

Would this be a desirable enhancement to train_on_batch? Perhaps I could accomplish it myself by pulling apart the Python Keras code for train_on_batch and creating my own version, but do you have ideas on the easiest way to achieve this in Keras 2.2.5 using only the R interface?

Thanks.

dfalbel commented 4 years ago

The recommended way to do this in TF2 is to build a custom training loop. For example: https://r-tensorflow.netlify.com/tutorials/advanced/customization/custom-training/#define-a-training-loop
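To make the suggestion concrete, here is a minimal sketch of gradient accumulation inside a TF2-style custom training loop, written with the R keras/tensorflow packages. The model, optimizer, loss, `accum_steps`, and the `train_micro_batch` helper are all illustrative choices, not part of any Keras API: gradients from several small "micro-batches" are summed into accumulator variables, and the optimizer step is applied only once every `accum_steps` calls, approximating a larger effective batch.

```r
library(keras)
library(tensorflow)

# Illustrative small model; any Keras model works the same way.
model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 8, kernel_size = 3, activation = "relu",
                input_shape = c(64, 64, 3)) %>%
  layer_flatten() %>%
  layer_dense(units = 10, activation = "softmax")

optimizer   <- optimizer_adam()
accum_steps <- 4  # micro-batches accumulated per weight update (assumed value)

# One zero-initialized accumulator per trainable weight.
accum_grads <- lapply(model$trainable_variables, function(v)
  tf$Variable(tf$zeros_like(v), trainable = FALSE))

# Hypothetical helper: compute gradients for one micro-batch, accumulate
# them, and apply the averaged update every `accum_steps` calls.
train_micro_batch <- function(x, y, step) {
  with(tf$GradientTape() %as% tape, {
    preds <- model(x, training = TRUE)
    loss  <- loss_categorical_crossentropy(y, preds)
  })
  grads <- tape$gradient(loss, model$trainable_variables)

  # Accumulate instead of applying immediately.
  for (i in seq_along(grads))
    accum_grads[[i]]$assign_add(grads[[i]])

  if (step %% accum_steps == 0) {
    # Average over the accumulated micro-batches, apply, then reset.
    scaled <- lapply(accum_grads, function(g) g / accum_steps)
    optimizer$apply_gradients(
      purrr::transpose(list(scaled, model$trainable_variables)))
    for (g in accum_grads) g$assign(tf$zeros_like(g))
  }
}
```

Called in a loop over micro-batches (`train_micro_batch(x_batch, y_batch, step)` with `step` counting up from 1), this trades one large forward/backward pass for several small ones, so peak GPU memory is bounded by the micro-batch size rather than the effective batch size.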