ponder-lab / Hybridize-Functions-Refactoring

Refactorings for optimizing imperative TensorFlow clients for greater efficiency.
Eclipse Public License 2.0

Calling next on an iterator over TF datasets should not have Python side-effects #279

Open khatchad opened 11 months ago

Consider the following code:

# From https://www.tensorflow.org/guide/function#using_python_iterators_and_generators

import tensorflow as tf

@tf.function
def good_consume_next(iterator):
  # This is ok, iterator is a tf.data.Iterator
  tf.print("Value:", next(iterator))

ds = tf.data.Dataset.from_tensor_slices([1, 2, 3])
iterator = iter(ds)
good_consume_next(iterator)
good_consume_next(iterator)
good_consume_next(iterator)

Above, iterator iterates over a TF dataset. Calling next() on it advances its cursor over the dataset. That should not be considered a Python side-effect, but only because the underlying container is a TF dataset.

Regression

I believe this is a type inference problem. The call to iter() above returns a different kind of iterator depending on its argument's type:

>>> type(iter(tf.data.Dataset.from_tensor_slices([1, 2, 3])))
<class 'tensorflow.python.data.ops.iterator_ops.OwnedIterator'>
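For contrast, here is a minimal sketch of the same distinction checked at run time, without importing TensorFlow. The helper names iterator_type_name and looks_like_tf_data_iterator are hypothetical, and the module-prefix check is only a heuristic based on the OwnedIterator type shown above. It is not how the refactoring itself infers types, which would have to be done statically.

```python
# Hypothetical helpers for illustration only; not part of the refactoring tool.

def iterator_type_name(iterable):
    """Return the fully qualified type name of iter(iterable)."""
    t = type(iter(iterable))
    return f"{t.__module__}.{t.__qualname__}"


def looks_like_tf_data_iterator(iterator):
    """Heuristic (an assumption): treat any iterator whose type lives under
    tensorflow.python.data as a TF dataset iterator, as OwnedIterator does."""
    return type(iterator).__module__.startswith("tensorflow.python.data")


# A plain Python list yields a built-in iterator, so calling next() on it
# advances Python state and would count as a Python side-effect.
print(iterator_type_name([1, 2, 3]))                  # builtins.list_iterator
print(looks_like_tf_data_iterator(iter([1, 2, 3])))   # False
```

If the inferred type of the argument to next() is a built-in iterator such as the one above, the call advances Python state; only when it resolves to a TF dataset iterator should it be exempt.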