Open viotemp1 opened 4 years ago
Hi, thanks for the question.
This API was not really tested in eager mode. It looks like TF does something different there, which is why there are problems when you try to copy data between devices.
Additionally, a dataset based on a DALI pipeline is infinite, so you need something like `take(10)` to be able to run `reduce`.
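The point above can be sketched with plain `tf.data` standing in for the DALI-backed dataset (the array contents below are made up for illustration):

```python
# An infinite (repeated) dataset must be bounded with take() before
# reduce() can terminate. Plain tf.data stands in for the DALI dataset here.
import numpy as np
import tensorflow as tf

infinite_ds = tf.data.Dataset.from_tensor_slices(np.zeros((4, 2))).repeat()

# Without take(), reduce() would iterate forever over the repeated dataset.
bounded = infinite_ds.take(10)
count = bounded.reduce(np.int64(0), lambda x, _: x + 1)
print(int(count))  # 10
```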
I was able to run something similar with `tf.Session` with eager mode disabled. I will look more into this and let you know.
Hi, thanks for the answer. I did not know that a dataset based on a DALI pipeline is infinite; I also found this out today by looping with take(1). Anyhow, I tried disabling eager mode in TF 2.0 (with tf.compat.v1.disable_eager_execution()), but this breaks too many other things for me.
```python
train_ds10 = train_ds.take(10)
train_ds10
# <TakeDataset shapes: ((128, 28, 28, 1), (128, 1)), types: (tf.uint8, tf.int64)>
train_ds_len = train_ds10.reduce(np.int64(0), lambda x, _: x + 1)
train_ds_len
# <tf.Tensor 'ReduceDataset_8:0' shape=() dtype=int64>
```
No error indeed, but no result either (on GPU or CPU). Thanks
To get the results in this mode you need something like:
```python
with tf.Session() as sess:
    print(sess.run(train_ds_len))
```
but I agree that this is rather a workaround than a solution. I'll look more into this and will update this thread when I know more.
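Putting the pieces of the workaround together, here is a hedged end-to-end sketch; plain `tf.data` again stands in for the DALI-backed dataset, and all names are illustrative, not from the original code:

```python
# Graph-mode workaround sketch: disable eager execution, build the reduce
# op as a symbolic tensor, then evaluate it inside a session.
import numpy as np
import tensorflow as tf

tf.compat.v1.disable_eager_execution()

# Stand-in dataset: 100 elements -> 10 batches -> bounded to 5 batches.
ds = tf.data.Dataset.range(100).batch(10).take(5)

# With eager disabled this is a graph tensor, not a concrete value,
# which is why printing it alone shows no result.
length_op = ds.reduce(np.int64(0), lambda x, _: x + 1)

with tf.compat.v1.Session() as sess:
    result = sess.run(length_op)
print(result)  # 5
```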
@awolant Is there any update about using DALI with TF 2.0/2.1 eager mode? :-)
@ben0it8 - we are currently pursuing other project goals and there is no update regarding using DALI in the eager mode. As soon as we can get back to it we update you.
Hello, I'm trying to apply reduce over a TF dataset created from a DALIDataset, but it does not work (on either GPU or CPU). What could be wrong?
tensorflow 2.0.0
tensorflow-addons 0.6.0
tensorflow-datasets 1.3.0
tensorflow-estimator 2.0.1
tensorflow-gpu 2.0.0
tensorflow-metadata 0.15.0
tensorflow-model-optimization 0.1.3
Keras 2.3.1
Keras-Applications 1.0.8
Keras-Preprocessing 1.1.0
nvidia-dali 0.15.0
nvidia-dali-nightly 0.17.0.dev20191202
nvidia-dali-tf-plugin-nightly 0.17.0.dev20191202
Here is part of the code:
```python
class TFRecordPipeline(Pipeline):
    def __init__(self, batch_size=1, device='gpu', num_threads=4, device_id=0, seed=0):
        super(TFRecordPipeline, self).__init__(batch_size, num_threads, device_id, seed)
        self.device = device
        self.input = ops.TFRecordReader(
            path=tfrecord,
            index_path=tfrecord_idx,
            features={
                'image_raw': tfrec.FixedLenFeature((), tfrec.string, ""),
                'label':  tfrec.FixedLenFeature([1], tfrec.int64, -1),
                'height': tfrec.FixedLenFeature([1], tfrec.int64, -1),
                'width':  tfrec.FixedLenFeature([1], tfrec.int64, -1),
                'depth':  tfrec.FixedLenFeature([1], tfrec.int64, -1)
            })

shapes = [
    (BATCH_SIZE, 28, 28, 1),
    (BATCH_SIZE, 1)]
dtypes = [
    tf.uint8,  # float32
    tf.int64]

def train_data_fn(batch_size=1, device='gpu', num_threads=4, device_id=0):
    pipeline = TFRecordPipeline(BATCH_SIZE, device=device,
                                num_threads=num_threads, device_id=device_id)
    tf_dali_set = dali_tf.DALIDataset(
        pipeline=pipeline,
        batch_size=BATCH_SIZE,
        shapes=shapes,
        dtypes=dtypes,
        device_id=device_id)

mnist_set = mnist_set.map(lambda features, labels: ({'images': features}, labels))

train_ds = train_data_fn(batch_size=BATCH_SIZE, device='cpu', num_threads=1, device_id=0)
train_ds.reduce(np.int64(0), lambda x, _: x + 1)
```
Errors on CPU:
InternalError Traceback (most recent call last)