pfnet / pfio

IO library to access various filesystems with unified API
https://pfio.readthedocs.io/
MIT License

Fix `S3.rename` didn't correctly remove the old file #248

Closed belltailjp closed 2 years ago

belltailjp commented 2 years ago

In the current pfio, S3.rename merely copies the content to a new object without removing the original one.

Fixing rename

>>> with pfio.v2.from_url('s3://suzuo/writer/test') as fs:
...     with fs.open('foo.dat', 'wb') as f:
...         f.write(b'hello world')
... 
...     print('ls before', list(fs.list()))
...     fs.rename('foo.dat', 'bar.dat')
...     print('ls after', list(fs.list()))

ls before ['foo.dat']
ls after ['bar.dat', 'foo.dat']

This is because rename calls remove with a key that is already normalized, and remove then normalizes it again internally. As a result, remove tries to delete a doubly-normalized key that does not exist, instead of the source object we actually need to delete. https://github.com/pfnet/pfio/blob/2.0.1/pfio/v2/s3.py#L523

This PR removes the unnecessary normalization when rename calls remove.

Fixing remove

One reason the incorrect rename behavior went unnoticed is that remove did not raise anything even when the specified target does not actually exist. This S3 behavior is inconsistent with other filesystems. Since it appears to be a constraint coming from boto3, or perhaps S3 itself, I added an explicit existence check.
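The semantics in question can be sketched without boto3. S3's DeleteObject succeeds even when the key does not exist, unlike POSIX unlink, so a filesystem-style remove has to check first. `FakeS3` below mimics that delete semantics; it is a toy model, not pfio's implementation:

```python
class FakeS3:
    """Toy model of an S3 bucket's delete semantics."""

    def __init__(self):
        self.objects = {}

    def delete_object(self, key):
        # S3 semantics: deleting a nonexistent key is NOT an error.
        self.objects.pop(key, None)


def remove(fs, key):
    # Explicit existence check restores filesystem-like behavior:
    # removing a missing object raises instead of silently succeeding.
    if key not in fs.objects:
        raise FileNotFoundError(f"No such S3 object: {key}")
    fs.delete_object(key)


if __name__ == "__main__":
    fs = FakeS3()
    fs.objects["foo.dat"] = b"hello"
    remove(fs, "foo.dat")      # ok
    try:
        remove(fs, "foo.dat")  # second removal now raises
    except FileNotFoundError as e:
        print("raised:", e)
```

With this check in place, a rename whose remove step misses its target fails loudly instead of leaving a stale object behind.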

Test refactoring

In addition, since the number of S3 test cases keeps growing, I DRYed the fixtures to make each test case a little simpler.

The following conventional test case

@mock_s3
def test_foo():
    bucket = "test-dummy-bucket"
    key = "it's me!deadbeef"
    secret = "asedf;lkjdf;a'lksjd"
    with S3(bucket, create_bucket=True):
        with from_url('s3://test-dummy-bucket/',
                      aws_access_key_id=key,
                      aws_secret_access_key=secret) as s3:
            ...
        with open_url('s3://test-dummy-bucket/xxxx', 'rb',
                      aws_access_key_id=key,
                      aws_secret_access_key=secret) as f:
            ...

can be equivalently rewritten as:

def test_foo(s3_fixture):
    with from_url('s3://test-dummy-bucket/', **s3_fixture.aws_kwargs) as s3:
        ...
    with open_url('s3://test-dummy-bucket/xxxx', 'rb', **s3_fixture.aws_kwargs) as f:
        ...
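The fixture itself is not shown above; one hypothetical way such a conftest.py fixture could be structured is sketched below. The `S3Fixture` field names and setup details are assumptions for illustration, not necessarily what this PR implements:

```python
# Hypothetical conftest.py sketch; field names and setup details are
# assumptions, not necessarily the PR's actual fixture.
from collections import namedtuple

import pytest

S3Fixture = namedtuple('S3Fixture', ['bucket', 'aws_kwargs'])


@pytest.fixture
def s3_fixture():
    # moto and pfio are imported lazily so the sketch stays importable
    # even where those test dependencies are absent.
    from moto import mock_s3
    from pfio.v2 import S3

    bucket = "test-dummy-bucket"
    aws_kwargs = dict(
        aws_access_key_id="it's me!deadbeef",
        aws_secret_access_key="asedf;lkjdf;a'lksjd",
    )
    with mock_s3():
        # Create the bucket once so each test starts from a clean state.
        with S3(bucket, create_bucket=True):
            yield S3Fixture(bucket=bucket, aws_kwargs=aws_kwargs)
```

Centralizing the mock setup this way means each test only spells out the part it actually cares about, and credential strings live in one place.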
belltailjp commented 2 years ago

/test

pfn-ci-bot commented 2 years ago

Successfully created a job for commit 83a36ac:


belltailjp commented 2 years ago

/test

pfn-ci-bot commented 2 years ago

Successfully created a job for commit 18c64d1:


belltailjp commented 2 years ago

/test

pfn-ci-bot commented 2 years ago

Successfully created a job for commit 7e5074d:


belltailjp commented 2 years ago

/test

pfn-ci-bot commented 2 years ago

Successfully created a job for commit e5d55ff:
