tensorflow / decision-forests

A collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models in Keras.
Apache License 2.0

OOM errors for large datasets #218

Open piotrlaczkowski opened 5 months ago

piotrlaczkowski commented 5 months ago

If we load a sufficiently big dataset as a tf.data.Dataset (streamed in batches rather than held entirely in memory), the instance crashes with an OOM error. Since we consume the dataset iteratively in batches, this should not happen, right?

Thus, it seems the model tries to load the entire dataset into memory. Is this behavior expected? How can we scale training to big-data usage?
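
For reference, a minimal sketch of the kind of pipeline described above; the file pattern, batch size, and label column are illustrative placeholders, not from the original report:

```python
import tensorflow as tf
import tensorflow_decision_forests as tfdf

# Stream a large on-disk dataset in batches; the file pattern,
# batch size, and label column are hypothetical examples.
dataset = tf.data.experimental.make_csv_dataset(
    "data/train-*.csv",
    batch_size=1024,
    label_name="label",
    num_epochs=1,  # a single pass over the data
    shuffle=False,
)

model = tfdf.keras.GradientBoostedTreesModel()

# Although `dataset` yields batches lazily, fitting on a large
# dataset still exhausts memory, which is the reported symptom.
model.fit(dataset)
```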

Thanks!