liormizr / s3path

s3path is a pathlib extension for AWS S3 Service
Apache License 2.0
206 stars 39 forks source link

Resources given to `register_configuration_parameter()` are not properly registered #123

Closed redevined closed 1 year ago

redevined commented 1 year ago

Example

I have an active AWS session with a temporary token. The following snippet works:

path = S3Path('/mybucket')
print(list(path.iterdir())) # works

After some time, the old token expires, so a new one is acquired and registered within s3path.

boto3.setup_default_session(...)
s3 = boto3.resource('s3')
s3path.register_configuration_parameter(path, resource = s3)

However, the previous command now fails:

path = S3Path('/mybucket')
print(list(path.iterdir())) # fails, due to not authorized

After closer inspection I found out that the resource accessed by S3Path objects is the same before and after refreshing the session:

path = S3Path('/mybucket')
print(list(path.iterdir())) # works
old_resource, _ = path._accessor.configuration_map.get_configuration(path)

boto3.setup_default_session(...)
s3 = boto3.resource('s3')
s3path.register_configuration_parameter(path, resource = s3)

new_resource, _ = path._accessor.configuration_map.get_configuration(path)
new_resource is s3 # False
new_resource is old_resource # True

I believe this is related to the usage of the lru_cache decorator for get_configuration() inside the _S3ConfigurationMap class.

Workaround

At present I can work around this issue by manually invalidating the cache after calling register_configuration_parameter():

path._accessor.configuration_map.get_configuration.cache_clear()

Suggested fix

The functools documentation states the following when working with lru_cache-decorated methods:

The above example assumes that the [attribute] never changes. If the relevant instance attributes are mutable, the cached_property approach can’t be made to work because it cannot detect changes to the attributes. To make the lru_cache approach work when the [attribute] is mutable, the class needs to define the __eq__() and __hash__() methods so that the cache can detect relevant attribute updates: [...]

I therefore recommend to implement the _S3ConfigurationMap.__hash__() and _S3ConfigurationMap.__eq__() magic methods, while considering the instance attributes attributes, resources and general_options.

If desired, I can submit a pull request as well.

liormizr commented 1 year ago

Hi @redevined Thanks for the issue I created a PR with a fix

liormizr commented 1 year ago

I'll close this issue when we will deploy new version

liormizr commented 1 year ago

Deployed in version 0.4.2