tensorflow / io

Dataset, streaming, and file system extensions maintained by TensorFlow SIG-IO
Apache License 2.0

File system scheme 's3' not implemented #1731

Open greper opened 1 year ago

greper commented 1 year ago

I get an error when I run TensorBoard with an S3 path as the logdir:

tensorboard --logdir=s3://mnist/log --port=6123

I found that this call is the cause of the problem:

tf.io.gfile.exists("s3://minst/log/xxxxxx")

import os

os.environ.setdefault("S3_ENDPOINT", "https://xxxx")
os.environ.setdefault("AWS_ACCESS_KEY_ID", "xxx")
os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "xxxx")
os.environ.setdefault("AWS_REGION", "us-east-1")

import tensorflow as tf
import tensorflow_io as tfio

gfile = tf.io.gfile.GFile("s3://minst/log/xxxxxx")
print(gfile)
# ↑ ↑ ↑ ---------it's ok

v = tf.io.read_file('s3://minst/log/xxxxx')
print(v) 
# ↑ ↑ ↑ ---------it's ok

tf.io.gfile.exists("s3://minst/log/xxxxxx")
# ↑ ↑ ↑ ---------error

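# the import was not shown in the original report; this assumes the
# tensorboard.main.run_main entry point was meant
from tensorboard import main as tb_main
tb_main.run_main()
# ↑ ↑ ↑ ---------error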

os.system("tensorboard --logdir=s3://mnist/log --port=6123")
# ↑ ↑ ↑ ---------error

# every call marked "error" above fails with the same exception:
Traceback (most recent call last):
  File "D:\Users\xiaojunnuo\Desktop\tensorboard\venv\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 288, in file_exists_v2
    _pywrap_file_io.FileExists(compat.path_to_bytes(path))
tensorflow.python.framework.errors_impl.UnimplementedError: File system scheme 's3' not implemented (file: 's3://minst/log/xxxxxxxxxxxxxxxx')

greper commented 1 year ago

https://github.com/tensorflow/tensorflow/blob/359c3cdfc5fabac82b3c70b3b6de2b0a8c16874f/tensorflow/python/lib/io/file_io.py#L270-L272

If 'tf.io.gfile.exists' does not support s3, then which interface can I use to check whether s3 files and directories exist?

dilverse commented 1 year ago

I am also running into this issue. Are there any updates?

dilverse commented 1 year ago

@greper I tried the latest TensorBoard build, and it seems to work fine together with tensorflow-io. Did you try that?

On further digging, I found that each tensorflow-io release is tightly coupled to a specific tensorflow version. The tensorflow-io README has a compatibility table. Without any backward compatibility, this is going to make every tensorflow/tensorboard upgrade trickier.

slittle-twilio commented 1 year ago

@dilverse I had the same issue because I did not pin the version of tensorflow-io when using tensorflow 2.10. Pinning tensorflow-io to version 0.27.0 solved the issue for me. I would have expected the package installer to take care of this dependency.
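
The version pairing described above can be checked before launching TensorBoard. What follows is a minimal sketch, not part of either library: the helper names (`minor_series`, `installed_version`, `check_pairing`) and the `KNOWN_PAIRS` table are hypothetical. The only pairing taken from this thread is tensorflow 2.10 with tensorflow-io 0.27.0; any other entries would have to come from the official compatibility table.

```python
from importlib import metadata

# Illustrative pairing table (an assumption for this sketch): only the
# 2.10 -> 0.27.0 entry is reported in this thread. Extend it from the
# official compatibility table before relying on it.
KNOWN_PAIRS = {
    "2.10": "0.27.0",
}


def minor_series(version):
    """Return the 'major.minor' prefix of a version string."""
    return ".".join(version.split(".")[:2])


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pairing(tf_version, tfio_version, pairs=KNOWN_PAIRS):
    """Return a description of the mismatch, or None if the pair is in the table."""
    if tf_version is None or tfio_version is None:
        return "tensorflow or tensorflow-io is not installed"
    expected = pairs.get(minor_series(tf_version))
    if expected is None:
        return f"no known tensorflow-io pairing for tensorflow {tf_version}"
    if tfio_version != expected:
        return (f"tensorflow {tf_version} expects tensorflow-io {expected}, "
                f"found {tfio_version}")
    return None


if __name__ == "__main__":
    problem = check_pairing(installed_version("tensorflow"),
                            installed_version("tensorflow-io"))
    print(problem or "tensorflow / tensorflow-io versions match the table")
```

Running a check like this at startup turns a version mismatch into a readable message instead of the less obvious "File system scheme 's3' not implemented" error. It only compares version strings; it does not verify that the s3 filesystem plugin actually loaded.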

rwo-work commented 1 year ago

I have the same problem, but only on Windows. I am using TensorFlow 2.11.0 and tensorflow-io 0.31.0. Under Linux I can use tf.io.gfile.glob without any problems. Under Windows, however, I get the "File system scheme 's3' not implemented" error. Is the s3 filesystem supported under Windows?