# Steps to reproduce:
t2t-datagen --tmp_dir=/tmp --problem=translate_ende_wmt32k --data_dir=gs://my_bucket/path
# Error logs:
Traceback (most recent call last):
File "/.virtualenvs/env3/bin/t2t-datagen", line 27, in <module>
tf.app.run()
File "/.virtualenvs/env3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "/.virtualenvs/env3/bin/t2t-datagen", line 23, in main
t2t_datagen.main(argv)
File "/.virtualenvs/env3/lib/python3.6/site-packages/tensor2tensor/bin/t2t_datagen.py", line 171, in main
tf.gfile.MakeDirs(FLAGS.data_dir)
File "/.virtualenvs/env3/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 374, in recursive_create_dir
pywrap_tensorflow.RecursivelyCreateDir(compat.as_bytes(dirname), status)
File "/.virtualenvs/env3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 519, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: 'object' must be a non-empty string. (File: gs://my_bucket/)
Fixes/Hacks
Not recommended, but good enough to isolate the cause.
This issue can be worked around by manually creating the bucket folder in GCS and commenting out tf.gfile.MakeDirs(FLAGS.data_dir).
Alternatively, instead of using a multi-level bucket path, just use the root path: gs://my_bucket.
The error comes from the pywrap_tensorflow binary and, I believe, cannot be debugged further without going into the TensorFlow source.
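For anyone who would rather not comment the line out entirely, here is a minimal sketch of the same hack as a guard. The helper name make_dirs_unless_gcs is hypothetical (it is not part of Tensor2Tensor or TensorFlow), and it assumes the GCS folder was already created by hand as described above, so skipping directory creation for gs:// paths is safe.

```python
import os


def make_dirs_unless_gcs(path, make_dirs=os.makedirs):
    """Call make_dirs(path) for non-GCS paths; skip the call for gs:// paths.

    Hypothetical helper, not part of any library. Returns True if
    make_dirs was called and False if the call was skipped.
    """
    if path.startswith("gs://"):
        # Assumption: the bucket folder already exists (created manually),
        # so there is nothing to create here.
        return False
    make_dirs(path)
    return True


if __name__ == "__main__":
    # Record which paths would be created instead of touching the filesystem.
    created = []
    for candidate in ["gs://my_bucket/path", "/tmp/t2t_data"]:
        called = make_dirs_unless_gcs(candidate, created.append)
        print(candidate, "-> make_dirs called:", called)
```

In t2t_datagen.py this would presumably replace the failing call with something like make_dirs_unless_gcs(FLAGS.data_dir, tf.gfile.MakeDirs). I have not tested that against the TensorFlow version in the traceback; it only isolates the cause and does not fix the underlying bug.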
Description
Using Google Cloud Storage for --data_dir fails. Using gs paths like gs://my-bucket/path/to/folder fails.
NOTE: This is most likely a TensorFlow bug rather than a Tensor2Tensor bug, but I am keeping it here for the record. ...
Environment information
For bugs: reproduction and error logs