fsspec / swiftspec

fsspec implementation for OpenStack SWIFT
MIT License

Test for existing directory #5

Open observingClouds opened 2 years ago

observingClouds commented 2 years ago

Currently, the exists method does not work as expected on directories. Swift does not have actual directories, but it would be great if the following worked:

Failing minimal example

import fsspec
fs = fsspec.filesystem("swift")
fs.lexists("swift://swift.dkrz.de/dkrz_948e7d4bbfbb445fbff5315fc433e36a/fsspec_test/bug01.zarr")

returns False although I expect it to return True.

On container-level and object-level the function returns as expected:

fs.lexists("swift://swift.dkrz.de/dkrz_948e7d4bbfbb445fbff5315fc433e36a/fsspec_test")
# True

fs.lexists("swift://swift.dkrz.de/dkrz_948e7d4bbfbb445fbff5315fc433e36a/fsspec_test/bug01.zarr/.zattrs")
# True

d70-t commented 2 years ago

This is a somewhat difficult task, as Swift is not actually a filesystem but an object store. Currently, the "folders" in swiftspec are generated implicitly by grouping objects (the "files") by prefixes delimited by /. It would be possible to implement exists by checking whether any object exists that has the given path as a prefix. However, that would add an extra listing request to every call to exists on a nonexistent object, which might slow those calls down.
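A minimal sketch of the prefix check described above, written against an in-memory set of object names rather than a real Swift container. The helper name exists_with_prefix_check and the sample object names are hypothetical and are not part of swiftspec; a real implementation would have to issue a container listing request restricted to the prefix instead of scanning a set.

```python
def exists_with_prefix_check(object_names, path):
    """Hypothetical helper: treat `path` as existing if it is an object
    itself, or if at least one object name starts with `path` + "/"."""
    # Exact match: the path names an actual object.
    if path in object_names:
        return True
    # Implicit "folder": some object lives below this prefix.
    prefix = path.rstrip("/") + "/"
    return any(name.startswith(prefix) for name in object_names)


# Assumed object names inside a single container (illustration only).
objects = {"bug01.zarr/.zattrs", "bug01.zarr/.zgroup"}

print(exists_with_prefix_check(objects, "bug01.zarr"))          # implicit folder
print(exists_with_prefix_check(objects, "bug01.zarr/.zattrs"))  # real object
print(exists_with_prefix_check(objects, "missing.zarr"))        # nothing there
```

The last call is the costly case: only after the exact lookup fails does the prefix scan run, so every check on a nonexistent path pays for both steps.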

Apart from this potential performance issue, it is in principle fine for all three of these objects to exist independently (e.g. fs.cat(path) would return independent content for each of the three):

It's not yet clear to me which return values would be appropriate for

fs.exists("swift://swift.dkrz.de/dkrz_948e7d4bbfbb445fbff5315fc433e36a/fsspec_test/bug01.zarr")

for all the potential combinations of these objects.

It seems that swiftbrowser and s3fs create empty folder/ objects for all intermediate folders and use these to check for their existence, so we probably want to do the same. Nonetheless, the s3fs documentation also says that strange errors regarding the existence of directories may occur.

Still, in this case, checking the existence of .../folder and .../folder/ might yield different results.
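To illustrate the marker-object approach and the .../folder versus .../folder/ difference, here is a small sketch using a plain dict as a stand-in for a container. It assumes that a marker is a zero-byte object whose name ends in /; the helper put_with_markers is hypothetical and does not reflect how swiftbrowser, s3fs, or swiftspec actually write objects.

```python
def put_with_markers(store, key, data):
    """Hypothetical helper: store `data` under `key` and create an empty
    marker object named "<folder>/" for every intermediate folder."""
    parts = key.split("/")
    for i in range(1, len(parts)):
        marker = "/".join(parts[:i]) + "/"
        # Keep an existing marker untouched, otherwise create an empty one.
        store.setdefault(marker, b"")
    store[key] = data


store = {}
put_with_markers(store, "bug01.zarr/group/.zarray", b"{}")

# Markers now exist for "bug01.zarr/" and "bug01.zarr/group/".
print(sorted(store))

# A plain key lookup distinguishes the two spellings of the folder name.
print("bug01.zarr/" in store)
print("bug01.zarr" in store)
```

With markers in place, an existence check on the name with the trailing slash is a single object lookup, while the name without the slash is simply a different (absent) object unless exists normalizes the path first.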