liormizr / s3path

s3path is a pathlib extension for AWS S3 Service
Apache License 2.0
208 stars 39 forks

Python 3.11.4 breaks `S3Path.glob` #136

Closed nlangellier closed 1 year ago

nlangellier commented 1 year ago

Hi @liormizr

Python 3.11.4 changed `pathlib._RecursiveWildcardSelector`, which affects globbing with `pathlib.Path` and, in turn, `s3path.S3Path`. Refs: Python issue #87695, Python PR #104362, and Python PR #104292. As a result, `s3path.tests.test_path_operations::test_glob_old_algo` now fails on the recursive glob line `assert list(S3Path('/test-bucket/').glob('**/*.test')) == [S3Path('/test-bucket/directory/Test.test')]`. GitHub Actions produces the following error message:

______________________________ test_glob_old_algo ______________________________

s3_mock = None, enable_old_glob = None

    def test_glob_old_algo(s3_mock, enable_old_glob):
>       test_glob(s3_mock)

tests/test_path_operations.py:169: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/test_path_operations.py:98: in test_glob
    assert list(S3Path('/test-bucket/').glob('**/*.test')) == [S3Path('/test-bucket/directory/Test.test')]
s3path.py:932: in glob
    yield from super().glob(pattern)
/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/pathlib.py:953: in glob
    for p in selector.select_from(self):
/opt/hostedtoolcache/Python/3.11.4/x64/lib/python3.11/pathlib.py:407: in _select_from
    for starting_point in self._iterate_directories(parent_path, is_dir, scandir):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pathlib._RecursiveWildcardSelector object at 0x7f33f724c910>
parent_path = S3Path('/test-bucket')
is_dir = <function S3Path.is_dir at 0x7f33fa643d80>
scandir = <function S3Path._scandir at 0x7f33fa63c0e0>

    def _iterate_directories(self, parent_path, is_dir, scandir):
        yield parent_path
        try:
            with scandir(parent_path) as scandir_it:
                entries = list(scandir_it)
            for entry in entries:
                entry_is_dir = False
                try:
>                   entry_is_dir = entry.is_dir(follow_symlinks=False)
E                   TypeError: S3DirEntry.is_dir() got an unexpected keyword argument 'follow_symlinks'

So `s3path.S3DirEntry.is_dir` is now being passed a new keyword argument, `follow_symlinks`, which it doesn't expect. I would create a PR to fix this, but the value being passed is `False`, and everywhere in s3path we state that `False` is an invalid value for `follow_symlinks`. So I'm not sure how to proceed with making s3path compatible with Python 3.11.4. WDYT?
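
To make the failure mode concrete, here is a minimal sketch that does not use s3path at all. `LegacyEntry`, `CompatEntry`, and `call_like_pathlib` are hypothetical names made up for illustration; they are not s3path's classes and this is not necessarily how the library should resolve it. The sketch only shows that a scandir-style entry whose `is_dir()` takes no arguments raises the same kind of `TypeError` when called the way the traceback shows, and that accepting the keyword (and ignoring it, on the assumption that S3 has no symlinks) is one possible way to avoid it.

```python
class LegacyEntry:
    """Hypothetical scandir-style entry whose is_dir() takes no arguments."""

    def __init__(self, name, is_directory):
        self.name = name
        self._is_directory = is_directory

    def is_dir(self):
        return self._is_directory


class CompatEntry(LegacyEntry):
    """Same hypothetical entry, but is_dir() tolerates the extra keyword."""

    def is_dir(self, *, follow_symlinks=True):
        # Assumption: the backing store has no symlinks, so the flag is
        # accepted for signature compatibility and otherwise ignored.
        return self._is_directory


def call_like_pathlib(entry):
    # Mirrors the call shown in the traceback:
    #     entry.is_dir(follow_symlinks=False)
    return entry.is_dir(follow_symlinks=False)


# The legacy signature rejects the keyword with a TypeError.
try:
    call_like_pathlib(LegacyEntry("directory", True))
    legacy_error = None
except TypeError as exc:
    legacy_error = str(exc)

# The tolerant signature answers normally.
compat_result = call_like_pathlib(CompatEntry("directory", True))
```

Whether silently accepting `follow_symlinks=False` is acceptable, given that s3path treats `False` as invalid elsewhere, is exactly the open design question here.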

liormizr commented 1 year ago

Nice, thank you. PR: #137

liormizr commented 1 year ago

Merged to master. I'll close this issue when we deploy the next version.

Tx

nlangellier commented 1 year ago

You're welcome!

liormizr commented 1 year ago

Deployed in version 0.5.0