Open syzymon opened 4 years ago
It turns out that the function trax.data.tf_inputs.download_and_prepare won't download the dataset in the case of imagenet64. It has to be downloaded manually, as per the t2t documentation in the imagenet.py data generator:
"""Image generator for Imagenet 64x64 downsampled images.
It assumes that the data has been downloaded from
http://image-net.org/small/*_32x32.tar or
http://image-net.org/small/*_64x64.tar into tmp_dir.
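A quick pre-flight check can confirm that the manually downloaded data is where the generator looks for it before starting a run. This is a minimal sketch, not part of trax or t2t: the helper name `missing_split_dirs` is hypothetical, `train_64x64` is taken from the error message in the description, and `valid_64x64` is an assumption about the name of the evaluation split directory.

```python
import os


def missing_split_dirs(tmp_dir, splits=("train_64x64", "valid_64x64")):
    """Return the split directories that are absent from tmp_dir.

    The split names are assumptions: "train_64x64" comes from the
    NotFoundError path, "valid_64x64" is a guess at the eval split name.
    """
    return [s for s in splits if not os.path.isdir(os.path.join(tmp_dir, s))]


if __name__ == "__main__":
    # Download directory as it appears in the error message from the Colab run.
    tmp_dir = "/root/tensorflow_datasets/download"
    missing = missing_split_dirs(tmp_dir)
    if missing:
        print("Extract the *_64x64.tar archives into", tmp_dir)
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected imagenet64 directories are present.")
```

If the list comes back non-empty, the archives still need to be downloaded and extracted into that directory by hand.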
One more issue that I had to resolve before I could run the reformer-imagenet64 gin config successfully was changing the image file reads to binary mode (also in t2t, in imagenet_pixelrnn_generator):
from
with tf.gfile.Open(filename, "r") as f:
to
with tf.gfile.Open(filename, "rb") as f:
Is this change required to load images from the dataset?
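For context on why the mode matters, here is a minimal sketch using Python's built-in `open` rather than `tf.gfile.Open`. It assumes the image files are binary-encoded (a PNG signature is used as a stand-in) and that text mode under Python 3 tries to decode the bytes as UTF-8. Whether `tf.gfile.Open` behaves identically is an assumption here, not something verified against the library.

```python
import os
import tempfile

# First 8 bytes of a PNG file; 0x89 is not a valid UTF-8 start byte.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_text_mode(path):
    """Read a file in text mode, decoding its bytes as UTF-8."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


def read_binary_mode(path):
    """Read a file in binary mode, returning the raw bytes unchanged."""
    with open(path, "rb") as f:
        return f.read()


def demo():
    """Write fake image bytes, then try both read modes on them."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "fake.png")
        with open(path, "wb") as f:
            f.write(PNG_SIGNATURE)
        try:
            read_text_mode(path)
            text_ok = True
        except UnicodeDecodeError:
            text_ok = False
        return text_ok, read_binary_mode(path)


if __name__ == "__main__":
    text_ok, data = demo()
    print("text mode succeeded:", text_ok)
    print("binary mode returned", len(data), "bytes")
```

In this sketch only the binary read returns the image bytes intact, which is consistent with needing "rb" when the encoded image is passed on as raw bytes.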
Description
The Imagenet64 dataset from tensor2tensor, used in this gin config: https://github.com/google/trax/blob/master/trax/supervised/configs/reformer_imagenet64.gin
seems to have some loading issues. I tried to run this config on Google Colab: https://colab.research.google.com/drive/1ysEQYOaIspHPBVu6S9jOxc7BkE2oDrh0
and ran into:
tensorflow.python.framework.errors_impl.NotFoundError: /root/tensorflow_datasets/download/train_64x64; No such file or directory
(A more detailed stack trace is provided below.) For reference, gin configs that use different datasets from t2t, like this most recent one: https://github.com/google/trax/blob/master/trax/supervised/configs/transformer_lm_cnndailymail.gin
worked correctly in the same colab. A different gin config that uses imagenet224, also from t2t, failed in a similar way to this imagenet64 one.
Is this a known issue?
Environment information
For bugs: reproduction and error logs