fsspec / universal_pathlib

pathlib api extended to use fsspec backends
MIT License

Fail to instantiate s3 path with `#` character in URI #164

Closed: alejoe91 closed this issue 7 months ago

alejoe91 commented 10 months ago

Hi,

When I try to instantiate a UPath from an s3 path with a `#` in the URI, the path is truncated at the `#`.

I'm running this on Ubuntu 22.04.

To reproduce:

from upath import UPath

# this is a public bucket
s3_uri = "s3://aind-open-data/ecephys_661279_2023-03-23_15-31-18/ecephys_compressed/experiment1_Record Node 104#Neuropix-PXI-100.ProbeA.zarr/"

s3path = UPath(s3_uri)
s3path

>>> S3Path('s3://aind-open-data/ecephys_661279_2023-03-23_15-31-18/ecephys_compressed/experiment1_Record Node 104')

Maybe similar to #144 ?

ap-- commented 10 months ago

Hi @alejoe91

That is indeed a bug related to the urllib-based parsing of the fsspec URIs: urllib treats everything after a `#` as a URL fragment, so it is dropped from the path.
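To illustrate the parsing behaviour, here is a minimal standard-library sketch. It only shows how `urllib.parse.urlsplit` handles the URI from the report; it does not reproduce the actual code path inside universal_pathlib, which is assumed here to rely on similar urllib splitting.

```python
from urllib.parse import urlsplit

# URI from the report above (public bucket)
s3_uri = (
    "s3://aind-open-data/ecephys_661279_2023-03-23_15-31-18/ecephys_compressed/"
    "experiment1_Record Node 104#Neuropix-PXI-100.ProbeA.zarr/"
)

parts = urlsplit(s3_uri)

# The bucket name ends up in the netloc component
print(parts.netloc)
# The path stops right before the "#"
print(parts.path)
# Everything after the "#" is parsed as the fragment
print(parts.fragment)
```

Anything that builds the path from `parts.path` alone loses the text after the `#`, which matches the truncated `S3Path` shown in the report.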

As a workaround for now, you can provide only the base path of the bucket and join the other parts:

import upath

# for example like this (or using pth.joinpath, or `pth / "otherpath"`)
upath.UPath("s3://aind-open-data/ecephys_661279_2023-03-23_15-31-18/ecephys_compressed", "experiment1_Record Node 104#Neuropix-PXI-100.ProbeA.zarr/", anon=True)
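If the full URI arrives as a single string, the split used in the workaround can be automated. The sketch below uses a hypothetical helper, `split_before_hash`, which is not part of universal_pathlib. It assumes the `#` never appears in the scheme or bucket name, and it cuts the URI at the last `/` before the first `#`.

```python
def split_before_hash(uri: str) -> tuple[str, str]:
    """Hypothetical helper: split a URI into a '#'-free base and the remainder.

    Assumes the first '#' appears in a path segment, not in the scheme
    or bucket name.
    """
    hash_index = uri.find("#")
    if hash_index == -1:
        # No '#' present, so nothing needs to be split off
        return uri, ""
    # Cut at the last '/' that precedes the first '#'
    slash_index = uri.rfind("/", 0, hash_index)
    return uri[:slash_index], uri[slash_index + 1:]


s3_uri = (
    "s3://aind-open-data/ecephys_661279_2023-03-23_15-31-18/ecephys_compressed/"
    "experiment1_Record Node 104#Neuropix-PXI-100.ProbeA.zarr/"
)
base, tail = split_before_hash(s3_uri)
print(base)
print(tail)

# The two parts can then be passed as in the workaround above, e.g.:
# upath.UPath(base, tail, anon=True)
```

The resulting `base` and `tail` are the same two strings passed to `upath.UPath` in the workaround.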

Cheers, Andreas 😃

alejoe91 commented 10 months ago

Hi @ap--

Thanks for the prompt reply and for the workaround! :)

Looking forward to seeing this fixed in main too!

Cheers, Alessio