Open isidentical opened 3 years ago
@martindurant today I was bitten by a similar issue in s3fs.core.S3FileSystem.isfile. I had an S3 bucket like the (currently existing) bucket modin-datasets, and it had an empty file testing/ in it, i.e. an object at s3://modin-datasets/testing/. There were also objects like modin-datasets/testing/test_data.parquet.
When I list the contents of 'modin-datasets/testing/', I see my object at 'modin-datasets/testing/':
from fsspec.core import url_to_fs
fs, path = url_to_fs("s3://modin-datasets/testing/")
# this prints a list including 'modin-datasets/testing/', 'modin-datasets/testing/test_data.parquet', ...
fs.ls('modin-datasets/testing/')
but my filesystem doesn't recognize modin-datasets/testing/ as a file!
assert not fs.isfile('modin-datasets/testing/')
The consequence was that I spent a long time trying to debug why s3fs was treating my directory as a file. I finally realized it was just trying to open a file that it had correctly found but could then no longer recognize as a file! Indeed, fs.open('modin-datasets/testing/').read() gives me valid contents, b''.
Is this a bug in s3fs? Is it a separate issue? How does it relate to #562?
Several ideas conflict in this kind of case, where a file and a directory have exactly the same name, including the trailing "/". This situation could not, of course, happen on a POSIX filesystem.
The ls method is designed to provide a list of outputs, and so the same name can appear twice, with different details. However, info only fetches one of these, and isfile/isdir use info.
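To make the point about ls versus info concrete, here is a minimal sketch that looks at a detailed listing directly instead of going through info. It assumes entries shaped like the dicts that fs.ls(path, detail=True) returns in fsspec (with at least "name" and "type" keys). The helper name types_for_name and the sample entries are made up for illustration; they are not output captured from the real bucket.

```python
from typing import Any, Dict, List, Set


def types_for_name(entries: List[Dict[str, Any]], name: str) -> Set[str]:
    """Return every 'type' that a detailed listing reports for exactly `name`.

    `entries` is expected to look like the result of fs.ls(path, detail=True):
    a list of dicts with at least "name" and "type" keys.
    """
    return {entry["type"] for entry in entries if entry["name"] == name}


# Hypothetical listing: the same name appears twice with different details,
# once as a zero-byte object and once as a directory-like prefix.
sample_listing = [
    {"name": "modin-datasets/testing/", "type": "file", "size": 0},
    {"name": "modin-datasets/testing/", "type": "directory", "size": 0},
    {"name": "modin-datasets/testing/test_data.parquet", "type": "file"},
]

found = types_for_name(sample_listing, "modin-datasets/testing/")
print(found)
```

If the returned set contains both "file" and "directory", the name is ambiguous, and a single info call can only report one of the two.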
The code above first creates an empty file whose name ends with a trailing slash. Then it runs s3fs's ls on the parent directory, which identifies that file as a directory. The second and third calls (info() and isdir()) also claim it is a directory, though when we try to do ls/walk etc. it behaves like a file. The following is the result of .ls('bucket/empty-dir/'); instead, I would have expected it to return an empty list.
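For anyone trying to compare these calls side by side, a small diagnostic along these lines might help. It is only a sketch: probe is a hypothetical helper name, and it assumes nothing beyond the ls, info, isfile and isdir methods that fsspec filesystems expose. It does not reproduce the bug by itself; it just collects what each call says about one path.

```python
from typing import Any, Dict


def probe(fs: Any, path: str) -> Dict[str, Any]:
    """Collect what ls, info, isfile and isdir each report for `path`.

    `fs` is any fsspec-style filesystem object (for example an
    s3fs.S3FileSystem); only its ls/info/isfile/isdir methods are used.
    """
    listing = fs.ls(path, detail=True)
    return {
        # Names that ls reports when asked to list `path`.
        "ls_names": [entry["name"] for entry in listing],
        # The single type that info settles on for `path`.
        "info_type": fs.info(path)["type"],
        "isfile": fs.isfile(path),
        "isdir": fs.isdir(path),
    }
```

With a real filesystem this would be called as probe(fs, 'bucket/empty-dir/'). If "ls_names" contains the path itself while "isdir" is True, the calls disagree in the way described above.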