liormizr / s3path

s3path is a pathlib extension for AWS S3 Service
Apache License 2.0
206 stars 39 forks source link

New Glob Algorithm Doesn't Work With Non Trivial Glob Patterns #120

Closed atfienberg closed 1 year ago

atfienberg commented 1 year ago

It appears the issue described in #115 is not completely fixed as of S3Path 0.4.1. While globbing nested directories with glob("*") works as expected, more complicated glob patterns that worked with the old glob algorithm do not work with the new one.

Steps to Reproduce

  1. Create a file within a nested directory in an S3 bucket. For example:
    % aws s3 ls s3://my-bucket/s3path-test/nested/further/
    2023-01-05 15:20:16         13 test.txt
  2. Create an S3Path instance representing the s3://my-bucket/s3path-test/nested/ directory:
    
    import s3path

path = s3path.S3Path("/my-bucket/s3path-test/nested/")


3. Call `list(path.glob("further/*"))`.

## Expected Result
The value `list(path.glob("further/*"))` should be a length one list containing an `S3Path` instance representing the file `test.txt`.

## Actual Result
`list(path.glob("further/*"))` is an empty list.

## Workaround
As described in #115, one can obtain the expected result by configuring `s3path` to use the old glob algorithm:

```python
import s3path

s3path.register_configuration_parameter(
    s3path.PureS3Path("/"), glob_new_algorithm=False
)

path = s3path.S3Path("/my-bucket/s3path-test/nested/")

print(list(path.glob("further/*")))

gives [S3Path('/my-bucket/s3path-test/nested/further/test.txt')].

I haven't looked extensively at which glob patterns work and which do not or at the effects of different nesting levels, but it appears an issue is present even if path represents a top-level S3 bucket. Consider the following snippet:

import s3path

path = s3path.S3Path("/my-bucket")

print(list(path.glob("s3path-test/*")))

This prints an unexpected value of [S3Path('/my-bucket/s3path-test')], whereas with the old glob algorithm it gives the expected value of [S3Path('/my-bucket/s3path-test/nested')].

Please let me know if you'd like any more information about this, and thank you!

liormizr commented 1 year ago

Working on a fix Thank you for the issue...

jlucier commented 1 year ago

@liormizr any plans to make a release with this fix?

liormizr commented 1 year ago

Hi @jlucier Sorry for the delay! Deployed now version 0.4.2

jlucier commented 1 year ago

Awesome, thanks @liormizr! All looks good on my end.