Open dluo96 opened 2 years ago
Hi!
When you do split=['train'], then the return type is a list of tf.data.Dataset. If you do split='train' or use ds_train[0], then the assert should not fail.
Sorry, looking at the code, I think the return type is actually a dict {'train': tf.data.Dataset(...)}, so using ds_train['train'] or split='train' should work.
Hi @tomvdw, thanks for the reply! Your suggestions were very helpful.
It seems that tfds.load(...) returns an instance of tensorflow.python.data.ops.dataset_ops._OptionsDataset (a subclass of tf.data.Dataset, I believe) when split='train'. Meanwhile, it seems to return a list when split=['train'], as you suggested in your first comment.
I think your first comment is correct based on a few experiments (see below) I ran:
Would you agree with this conclusion?
Experiment 1: Setting split='train'
import tensorflow as tf
import tensorflow_datasets as tfds
ds_train = tfds.load(
'iris',
shuffle_files=True,
split='train',
as_supervised=True,
)
assert isinstance(ds_train, tf.data.Dataset)
Experiment 2: Use ds_train[0]
import tensorflow as tf
import tensorflow_datasets as tfds
ds_train = tfds.load(
'iris',
shuffle_files=True,
split=['train'],
as_supervised=True,
)
assert isinstance(ds_train[0], tf.data.Dataset)
Experiment 3: Use ds_train['train']
import tensorflow as tf
import tensorflow_datasets as tfds
ds_train = tfds.load(
'iris',
shuffle_files=True,
split=['train'],
as_supervised=True,
)
assert isinstance(ds_train['train'], tf.data.Dataset)
This returns the error
TypeError: '_OptionsDataset' object is not subscriptable
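The behaviour seen in the experiments above (a single dataset for split='train', a list for split=['train']) can be smoothed over with a small helper. The sketch below is a hypothetical utility, not part of tensorflow_datasets: the name unwrap_single_split is made up for illustration, and the dict branch is only a precaution for calls that return a mapping of split names to datasets.

```python
from collections.abc import Mapping, Sequence


def unwrap_single_split(result):
    """Return the lone dataset from whatever structure a load call returned.

    Hypothetical helper for illustration; not part of tensorflow_datasets.
    """
    # A dict-like result (a mapping of split name to dataset): use its values.
    if isinstance(result, Mapping):
        values = list(result.values())
    # A list or tuple result (as observed for split=['train']): use its items.
    elif isinstance(result, Sequence) and not isinstance(result, (str, bytes)):
        values = list(result)
    # Anything else is assumed to already be a single dataset.
    else:
        return result

    # Refuse to guess when more than one split came back.
    if len(values) != 1:
        raise ValueError(f"expected exactly one split, got {len(values)}")
    return values[0]
```

With this, something like ds_train = unwrap_single_split(tfds.load('iris', split=['train'], as_supervised=True)) should give the same kind of object as split='train', assuming the list behaviour reported in Experiment 2 holds for your tfds version.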
Short description
When I load the iris dataset (https://www.tensorflow.org/datasets/catalog/iris) using the tfds.load function, the returned object is not a tf.data.Dataset object (which should be the case according to https://www.tensorflow.org/datasets/overview#tfdsload).
Environment information
Operating System: macOS
Python version: 3.8
tensorflow-datasets/tfds-nightly version: tensorflow-datasets 4.5.2
tensorflow/tf-nightly version: tensorflow 2.7.0
Does the issue still exist with the last tfds-nightly package (pip install --upgrade tfds-nightly)? Yes
Reproduction instructions
Link to logs
2022-03-05 13:27:35.574589: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
  File "structured/iris.py", line 12, in
    assert isinstance(ds_train, tf.data.Dataset)
AssertionError
Expected behavior
I expect assert isinstance(ds_train, tf.data.Dataset) to pass without AssertionError.
Additional context
N/A.
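The bare AssertionError in the logs above does not say what type ds_train actually had, which is what the replies had to work out by hand. A check that names the actual type makes that visible straight away. The following is only a sketch; require_instance is a hypothetical helper name, not something provided by TensorFlow or tfds.

```python
def require_instance(obj, expected_type, name="object"):
    """Return obj unchanged, or raise a TypeError naming its actual type.

    Hypothetical helper for illustration; not part of TensorFlow or tfds.
    """
    if not isinstance(obj, expected_type):
        actual = type(obj)
        # Include module and class name so the real type is visible in logs.
        raise TypeError(
            f"{name} should be {expected_type.__name__}, "
            f"got {actual.__module__}.{actual.__qualname__}"
        )
    return obj
```

Calling require_instance(ds_train, tf.data.Dataset, "ds_train") in place of the assert would then report the real type in the error message, for example builtins.list if the call returned a list.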