fsspec / filesystem_spec

A specification that python filesystems should adhere to.
BSD 3-Clause "New" or "Revised" License

Custom Caching Filesystem does not work with a url string #1609

Open mpiannucci opened 4 months ago

mpiannucci commented 4 months ago

I am creating a custom Redis caching filesystem, and I found that opening a file with a chained URL string does not work because of this hardcoded sequence in https://github.com/fsspec/filesystem_spec/blob/0bb3f26c412d7ad9b2d52a5c32265014709d1c1f/fsspec/core.py#L352:

        bit = cls._strip_protocol(bit)
        if (
            protocol in {"blockcache", "filecache", "simplecache"}
            and "target_protocol" not in kw
        ):
            bit = previous_bit
        out.append((bit, protocol, kw))
        previous_bit = bit

Is there a way to work around this without forking fsspec? I have registered my class and am trying to open a file like this:

import fsspec
from redis_fsspec_cache.sync import RedisCachingFileSystem

fsspec.register_implementation("rediscache", RedisCachingFileSystem)

with fsspec.open(
    "rediscache::s3://nextgen-dmac-cloud-ingest/nos/ngofs2/nos.ngofs2.fields.best.nc.zarr",
    mode="r",
    s3={"anon": True},
    rediscache={"redis_port": 6380},
) as f:
    print(len(f.read()))

Opening through the filesystem works fine. Thank you for any help you can give!

martindurant commented 4 months ago

Hm, I'm not entirely sure what that line does now - I suppose this is specific to filesystem-on-filesystem situations (as opposed to filesystem-on-file, like ZIP, or explicit remote kwargs, like referenceFS). I am open to suggestions! By this point, we do have access to the filesystem class cls, so special cases like the three listed could instead be marked with a class attribute.
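A minimal sketch of the class-attribute idea, for illustration only. It is a simplified stand-in for the loop quoted above, not the actual fsspec code: the attribute name `wraps_target_path`, the toy classes, the `REGISTRY` dict and the `unchain_with_attribute` function are all hypothetical, and the real implementation handles more cases.

```python
# Simplified stand-in for the chain-unpacking loop; NOT the real fsspec code.
# All names here (wraps_target_path, REGISTRY, ...) are hypothetical.


class BaseFS:
    # Hypothetical marker: True for filesystems that wrap another filesystem
    # and should receive the wrapped filesystem's path unchanged.
    wraps_target_path = False

    @classmethod
    def _strip_protocol(cls, path):
        # Toy version: drop everything up to and including "://".
        return path.split("://", 1)[-1]


class S3Like(BaseFS):
    pass


class RedisCacheLike(BaseFS):
    # A custom caching filesystem opts in without being hardcoded anywhere.
    wraps_target_path = True


# Toy registry mapping protocol names to classes.
REGISTRY = {"s3": S3Like, "rediscache": RedisCacheLike}


def unchain_with_attribute(bits, kwargs=None):
    """Resolve (path, protocol, kwargs) for each segment of a chained URL.

    `bits` lists the segments outermost first,
    e.g. ["rediscache://", "s3://bucket/key"].
    """
    kwargs = kwargs or {}
    out = []
    previous_bit = None
    for bit in reversed(bits):
        protocol = bit.split("://", 1)[0]
        cls = REGISTRY[protocol]
        kw = kwargs.get(protocol, {})
        bit = cls._strip_protocol(bit)
        # The class attribute replaces the hardcoded set of protocol names.
        if cls.wraps_target_path and "target_protocol" not in kw:
            bit = previous_bit
        out.append((bit, protocol, kw))
        previous_bit = bit
    out.reverse()
    return out


result = unchain_with_attribute(
    ["rediscache://", "s3://bucket/key"],
    {"rediscache": {"redis_port": 6380}},
)
print(result)
```

With this arrangement, the caching layer receives the same path as the filesystem it wraps, and any registered class can opt in by setting the attribute.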

mpiannucci commented 4 months ago

Yeah, so that line is only relevant to filesystem-on-filesystem setups. When the protocol name is one of those three cache implementations, it passes the underlying path through; otherwise, it strips it out.
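To make that behaviour concrete, here is a rough, self-contained imitation of the check quoted at the top of the thread. It only mimics the quoted lines: `strip_protocol` and `resolve_paths` are hypothetical helpers, and the real code in fsspec/core.py does more than this.

```python
# Toy imitation of the hardcoded check; NOT the real fsspec code.
CACHE_PROTOCOLS = {"blockcache", "filecache", "simplecache"}


def strip_protocol(bit):
    # Hypothetical helper: drop everything up to and including "://".
    return bit.split("://", 1)[-1]


def resolve_paths(bits, kwargs=None):
    """Return (path, protocol) per segment; `bits` is outermost first."""
    kwargs = kwargs or {}
    out = []
    previous_bit = None
    for bit in reversed(bits):
        protocol = bit.split("://", 1)[0]
        kw = kwargs.get(protocol, {})
        bit = strip_protocol(bit)
        # Only the three hardcoded names inherit the inner path.
        if protocol in CACHE_PROTOCOLS and "target_protocol" not in kw:
            bit = previous_bit
        out.append((bit, protocol))
        previous_bit = bit
    out.reverse()
    return out


builtin = resolve_paths(["simplecache://", "s3://bucket/key"])
custom = resolve_paths(["rediscache://", "s3://bucket/key"])
print(builtin)  # [('bucket/key', 'simplecache'), ('bucket/key', 's3')]
print(custom)   # [('', 'rediscache'), ('bucket/key', 's3')]
```

The custom protocol ends up with an empty path because its name is not in the hardcoded set, which matches the failure reported above.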

martindurant commented 4 months ago

I'm happy to see any solution to this you think appropriate.