thumbor-community / aws

Thumbor AWS extensions
MIT License
155 stars 70 forks source link

Refactor to use aiobotocore, python3 and tornado 7 #142

Closed kkopachev closed 2 years ago

Bladrak commented 4 years ago

Thanks for this huge update! :) This should probably go to a new branch on this repo, while Thumbor's 7th version is in alpha. I'm also wondering how we can test for compatibility with previous versions (noticeably for potential issues on image paths). This will probably also require some performance tests as have been done in the current version, as you're removing the presign_loader which was mostly done to answer perf issues.

So to sum up:

kkopachev commented 4 years ago

I saw aiobotocore was updated to v1.0 and decided to update it here. They introduced braking change in order to make credentials refreshing asynchronous. Changed is that create_client method is now has to be called as context manager. If we're going to use it that way, then client would be recreated for each outgoing request to s3. Client would also create a pool of aio http connections and kill it right after request. I am struggling to find a decent way to make one client and reuse it (singleton) for the life of thumbor. That's what Bucket object was meant for. Credentials would live on a shared session object, there is no problem sharing them across requests. The problem is with clients and http pools. I opened an issue with aiobotocore to solicit some guidance on the situation. One possible solution would be to call __aenter__/__aexit__ methods manually and store client in a loader/storage instance property, but that looks very magical.

Would love if somebody here can also take a look.

heynemann commented 4 years ago

I would comment but have zero context in aiobotocore. :(

amanagr commented 4 years ago

Hey, Just wanted you guys to know that I am excited about this PR since we in https://github.com/zulip use this package and it's a potential blocker for us to upgrade thumbor from python 2 to 3. Let me know if I can help in any way.

timabbott commented 4 years ago

Any update on this? It's blocking Zulip testing the upgrade to Thumbor 3.

kkopachev commented 4 years ago

Hey guys, I've pushed changes to accommodate latest aiobotocore version. Take a look please. I know we should be closing the connection pool on thumbor exit, but that does not seem to be an option, so I left it open, hoping python will clean up all remaining resources on exit. Might be not ideal solution.

@Bladrak I've submitted to new branch. Backward compat should be covered by tests which had rather small changes. Performance should probably be tested, not sure how exactly. I've removed presigning loader since in theory it should not be faster, since all aws requests are already async. Also theoretically this branch should be faster in case using instance metadata or ecs task metadata to fetch credentials, since such request (and refresh request too) would now be async.

Please CR.

@heynemann think it's worth adding async initializer/finalizer to base storages to be called on thumbor startup/shutdown?

amanagr commented 4 years ago

I appreciate the progress, but we are blocked on this PR before upgrading to python3 for thumbor. We would appreciate its completion.

Bladrak commented 4 years ago

@amanagr if you want to help, a good way might be to test it on your side. Once we have some feedback we may release it as an alpha.

amanagr commented 4 years ago

I am trying to test this on a s3 bucket, I received the following error when uploading the image:

tc_aws/aws/bucket.py", line 61, in get
    return await self._client.get_object(
AttributeError: 'ClientCreatorContext' object has no attribute 'get_object'

Any idea what might be wrong? (I think this could be related to the version of aiobotocore installed on my system which is 1.0.7 right now.)

Documentation says the method is correct though, https://github.com/aio-libs/aiobotocore/blob/1.0.7/docs/index.rst

amanagr commented 4 years ago

I am not sure what you guys are doing here: (bucket.py)

            _cm_type = type(cm)

            self._client = await _cm_type.__aenter__(cm)

why not directly use cm?

amanagr commented 4 years ago

I was trying to just upgrade this project from python2 to python3 without the move to aiobotocore. Then I saw https://github.com/nanvel/tornado-botocore/issues/18 and I wonder if the move to aiobotocore is necessary since tornado-botocore is deprecated and doesn't play nice with the latest dependencies in the thumbor ecosystem (thumbor + libthumbor).

Bladrak commented 2 years ago

Closing in favor of #147