fsspec / universal_pathlib

pathlib api extended to use fsspec backends

UPath handling of paths with double slashes `//` #144

Open ap-- opened 9 months ago

ap-- commented 9 months ago

_Originally posted by @rdbisme in https://github.com/fsspec/universal_pathlib/issues/80#issuecomment-1721542132_

I also have another problem, but I'm not sure whether it requires a separate issue.

```python
import os

import boto3
from upath import UPath

# Mocked AWS Credentials for moto
os.environ["AWS_ACCESS_KEY_ID"] = "testing"
os.environ["AWS_SECRET_ACCESS_KEY"] = "testing"
os.environ["AWS_SESSION_TOKEN"] = "testing"

conn = boto3.resource(
    "s3", region_name="us-east-1", endpoint_url="http://localhost:5000"
)
conn.create_bucket(Bucket="mybucket")

conn.Object("mybucket", "//key").put(Body=b"Something")

up = UPath("s3://mybucket", endpoint_url="http://localhost:5000")

print(list(up.rglob("*")))
file = next(up.rglob("*"))
print(file.is_file())
print(file.is_dir())
print(file.exists())
```

Output:

```
[S3Path('s3://mybucket/key')]
False
False
False
```

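The double slash is being "merged" into a single slash, which creates a weird ghost file: `rglob` yields `s3://mybucket/key`, but `is_file()`, `is_dir()`, and `exists()` all return `False` for it.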

Constructing the path manually with `UPath("s3://bucket//key")` works as expected.
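For illustration only, here is a minimal sketch of how pathlib-style normalization collapses repeated slashes. It uses the standard library's `PurePosixPath` as a stand-in; whether `rglob` in upath goes through the same normalization is an assumption, not something confirmed in this thread. The `bucket` and `key` values are the ones from the report.

```python
from pathlib import PurePosixPath

# Same bucket and key as in the report above.
bucket, key = "mybucket", "//key"

# The full object path as the store sees it: "mybucket///key".
raw = f"{bucket}/{key}"

# pathlib collapses repeated slashes in the middle of a path.
# Assumption: a similar step happens somewhere when rglob builds its results.
normalized = str(PurePosixPath(raw))

# Split the bucket back off to see which key the normalized path refers to.
recovered_key = normalized.split("/", 1)[1]

print(raw)
print(normalized)
print(recovered_key == key)
```

If this is what happens, the normalized path refers to the key `key` rather than `//key`, and no object exists under that name, which would match the three `False` results.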

ap-- commented 9 months ago

Hi @rdbisme

Thanks for reporting! I created a new issue to track this separately. Would you be willing to create a standalone test in upath/tests/implementations/test_s3.py that reproduces this issue?