liormizr / s3path

s3path is a pathlib extension for AWS S3 Service
Apache License 2.0
206 stars 39 forks source link

Feature: `VersionedS3Path` #139

Closed chnpenny closed 11 months ago

chnpenny commented 1 year ago

This PR addresses issue #126 by

  1. Adding a VersionedS3Path class to the public API
  2. Adding a st_version_id property to StatResult
  3. Adding automated tests for the new functionality in tests/test_path_operations.py
  4. Adding appropriate documentation to README.rst and docs/interface.rst

Additional notes

chnpenny commented 1 year ago

@liormizr this PR is ready. Can we get a review please?

liormizr commented 12 months ago

@chnpenny how do you want to continue? I'm waiting for this PR for a big version Do you want to continue or do you want me to take when you did as an example and write my own version of this feature?

nlangellier commented 12 months ago

@chnpenny how do you want to continue? I'm waiting for this PR for a big version Do you want to continue or do you want me to take when you did as an example and write my own version of this feature?

@liormizr apologies. we have been quite busy until now. We will work on this now.

nlangellier commented 11 months ago

@liormizr thanks for the review. i implemented your suggestions and this is ready for another round of review.

nlangellier commented 11 months ago

Thanks for the comment about relative paths @liormizr. This is again ready for review.

liormizr commented 11 months ago

thanks @chnpenny & @nlangellier I think that we are almost done Two notes

  1. after adding PureVersionedS3Path and VersionedS3Path can we update the doc's accordingly? I mean in the docs/interface.rst file
  2. In PureVersionedS3Path why didn't we in the end go with changing the _from_parts method?
nlangellier commented 11 months ago

thanks @chnpenny & @nlangellier I think that we are almost done Two notes

  1. after adding PureVersionedS3Path and VersionedS3Path can we update the doc's accordingly? I mean in the docs/interface.rst file
  2. In PureVersionedS3Path why didn't we in the end go with changing the _from_parts method?

Thank you @liormizr

Regarding the first question: Yes forgot to update that. Will do that today.

Regarding the second point: We felt it was unnecessary to do so, but perhaps we missed an important detail as to why you suggested to override the _from_parts method? One of our goals in this PR was to avoid code duplication when possible. _VersionedS3Accessor required a fair bit of code duplication, which is OK. But as another example, you had originally suggested to define PureVersionedS3Path.from_uri as:

def from_uri(cls, uri: str, *, version_id: str) -> PureVersionedS3Path:
    if not uri.startswith('s3://'):
        raise ValueError('Provided uri seems to be no S3 URI!')
    return cls(uri[4:], version_id=version_id)

but we made the decision to recycle the code from PureS3Path.from_uri by implementing PureVersionedS3Path.from_uri as:

def from_uri(cls, uri: str, *, version_id: str) -> PureVersionedS3Path:
    self = PureS3Path.from_uri(uri)
    return cls(self, version_id=version_id)

The idea there was that if a future change is made to PureS3Path.from_uri, the same change will very likely be needed in PureVersionedS3Path.from_uri as well and thus the change will only need to be made in one place. In a similar manner of thinking we implemented PureVersionedS3Path.__rtruediv__ by trying to minimize code duplication and recycle code where possible. By defining it as

def __rtruediv__(self, key):
    if not isinstance(key, (PureS3Path, str)):
        return NotImplemented
    new_path = super().__rtruediv__(key)
    new_path.version_id = self.version_id
    return new_path

then we let pathlib.PurePath.__rtruediv__ "do the heavy lifting" so to speak and we only add the additional functionality needed on top of that. If you don't like this approach or there is some subtle bug we are not thinking of, please let us know and we'll be happy to implement your idea of additionally overriding _from_parts.

liormizr commented 11 months ago

For the first version we can go with your approach Lets see maybe it's just my preference

Well done and thank you for the hard work

nlangellier commented 11 months ago

@liormizr we hadn't finished updating the documentation yet. We'll open a separate PR for that later today