fsspec / universal_pathlib

pathlib api extended to use fsspec backends
MIT License

Provide support for relative file uris? #142

Open ap-- opened 1 year ago

ap-- commented 1 year ago

This is specifically regarding URIs of the form "file:path/to/somewhere"

We should raise a clear error message for now. If pathlib or we later decide to support parsing relative-path URIs, we can switch, see:

_Originally posted by @ap-- in https://github.com/fsspec/universal_pathlib/issues/108#issuecomment-1696410259_
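For illustration, here is a minimal sketch of what such an error could look like. It relies only on the standard library's `urllib.parse.urlsplit`; the helper name `check_file_uri` and the wording of the message are hypothetical and not part of universal_pathlib.

```python
from urllib.parse import urlsplit


def check_file_uri(uri: str) -> str:
    """Return the path of an absolute file URI, or raise for a relative one.

    Hypothetical helper, not universal_pathlib API.
    """
    parts = urlsplit(uri)
    if parts.scheme != "file":
        raise ValueError(f"not a file URI: {uri!r}")
    # "file:path/to/somewhere" has no authority and a path without a
    # leading slash, i.e. it is a relative (rootless) file URI.
    if not parts.netloc and not parts.path.startswith("/"):
        raise ValueError(
            f"relative file URIs are not supported: {uri!r}; "
            "use an absolute form such as 'file:///path/to/somewhere'"
        )
    return parts.path
```

With this sketch, `check_file_uri("file:///path/to/somewhere")` returns the path, while `check_file_uri("file:path/to/somewhere")` raises a `ValueError` that names the unsupported form.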

joouha commented 1 year ago

Hi!

Thanks for opening this issue - I had to take some time off and never got round to doing it myself.

I'm mainly interested in data URIs (which I want to support in euporie). They consist of a scheme and a rootless path, with no authority part, query string, or fragment, e.g.:

data:image/jpeg;base64,/9j/4AAQSkZJRgABAgAAZABkAAD
\__/ \___________________________________________/
 |                         |
scheme                    path
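As a rough sketch of how a custom implementation might interpret that path part, the snippet below splits a data URI into a media type and decoded bytes. The function name `parse_data_uri` is hypothetical, the default media type follows RFC 2397, and this is not how euporie or universal_pathlib handle it.

```python
import base64
from urllib.parse import unquote_to_bytes


def parse_data_uri(uri: str) -> tuple[str, bytes]:
    """Split a data URI into (media_type, payload_bytes). Hypothetical helper."""
    scheme, sep, path = uri.partition(":")
    if not sep or scheme.lower() != "data":
        raise ValueError(f"not a data URI: {uri!r}")
    # The rootless path is "<mediatype>[;base64],<data>".
    header, comma, payload = path.partition(",")
    if not comma:
        raise ValueError("data URI is missing the ',' separator")
    is_base64 = header.endswith(";base64")
    media_type = header[: -len(";base64")] if is_base64 else header
    data = base64.b64decode(payload) if is_base64 else unquote_to_bytes(payload)
    # RFC 2397 default when the media type is omitted.
    return media_type or "text/plain;charset=US-ASCII", data
```

For example, `parse_data_uri("data:text/plain;base64,aGVsbG8=")` yields the media type `text/plain` and the bytes `b"hello"`.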

I think you previously said you don't want to support these in universal_pathlib, but it would be nice to be able to write a UPath subclass and integrate them via the registry system.

This worked previously, but the switch to fsspec.core.split_protocol for extracting the URI scheme (protocol) broke it, because the current implementation assumes that the path is absolute or that an authority is present.
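To make the failure mode concrete, here is a simplified sketch that contrasts a splitter which only recognizes `scheme://...` with one that follows the RFC 3986 scheme grammar. Both functions are hypothetical illustrations: neither is the actual fsspec or universal_pathlib code, and the single-letter check for Windows drive letters is an assumption of this sketch.

```python
import re

# RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
_SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):(.*)$", re.DOTALL)


def split_on_authority_marker(uri: str):
    """Only recognizes 'scheme://rest'; anything else is treated as a plain path."""
    if "://" in uri:
        scheme, rest = uri.split("://", 1)
        return scheme, rest
    return None, uri


def split_scheme(uri: str):
    """Split on the first ':' whenever a valid scheme precedes it."""
    match = _SCHEME_RE.match(uri)
    # Assumption: a single-letter "scheme" is more likely a drive such as C:\
    if match is None or len(match.group(1)) == 1:
        return None, uri
    return match.group(1), match.group(2)
```

With the first function, `"data:image/jpeg;base64,..."` and `"file:path/to/somewhere"` come back with no protocol at all, whereas the second returns `"data"` and `"file"` and leaves the interpretation of the remainder to the path class.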

ap-- commented 1 year ago

In the draft PR #152 for Python 3.12 support, I kept your data URI use case in mind and switched to a non-fsspec split_protocol implementation. At some point this will become the default implementation for all Python versions.

And yes, while I still think that a data UPath class shouldn't necessarily be shipped in universal_pathlib, I fully agree that we shouldn't make it hard for a custom implementation to interpret the path part of the URI in whatever way it needs.

So in the meantime, since euporie is one of the popular dependents, I'd be happy to keep your custom use case working. If you could make a PR that adds a minimal third-party test (like in https://github.com/fsspec/universal_pathlib/blob/main/upath/tests/third_party/test_pydantic.py) for the data UPath use case in euporie, that would be great. (As I understand it, these tests should then fail on the current version.)

I can then start working on backporting some of the stuff from #152 to the current implementation.