thumbor-community / aws

Thumbor AWS extensions
MIT License
155 stars 70 forks source link

Reuse Bucket and Botocore objects to prevent cyclic references #112

Closed kkopachev closed 6 years ago

kkopachev commented 6 years ago

calling Botocore on each call to loader/storage seems expensive, given that tornado_botocore object then creates client object, which has a bunch of cyclically referenced objects inside. About 1-2k of objects total. Eventually they are getting garbage collected, but why bother running expensive gc, when reusing same objects can improve things.

This PR essentially implementing ideas I outlined in #109. Collected garbage went down from 1600-1700 objects to 34 on each call to loader.

@Bladrak thoughts?

closes #109

Bladrak commented 6 years ago

Hi, and thanks for contributing :) I'm wondering whether this doesn't duplicate a bit the logic within the session handler. Having a global bucket may remove the need for a global session. What do you think?

Bladrak commented 6 years ago

@kkopachev given I've merged the MAX_RETRY pull request, this one presents conflicts. I'd rather only publish one version with both fixes if that's ok with you?

kkopachev commented 6 years ago

@Bladrak that's totally fine with me releasing in one shot. I'll rebase this PR.

kkopachev commented 6 years ago

@Bladrak rebased on top of latest master, but discovered that thumbor for each request creates new storage/result storage instance, which ruins the whole idea :disappointed: We could make the Bucket objects for get/put/delete to be stored in global registry, but not sure how good idea is that. What do you think?

Bladrak commented 6 years ago

I'm not familiar with the global registry, is this handled by tornado? Given the storage/result storage instances are directly handled by thumbor, we may not be able to do much here... Besides maybe trying to provide a singleton for those?

kkopachev commented 6 years ago

sorry, being busy lately. Yes, I meant something like Singleton, but we'll have multiple instances, one per bucket+operation. I hope I could make these fixes soon.

kkopachev commented 6 years ago

@Bladrak I think this is ready. CircleCI issue seems unrelated to code changes, but rather CI config itself. Tests are passing locally.

In short, Bucket object is now singleton (for a given set of arguments) and contains botocore clients on it, so it does not reinitialize everything on every request.

Bladrak commented 6 years ago

LGTM, I'm trying to have the tests pass before merging. Do you have some metrics about the improvements of GC?

Bladrak commented 6 years ago

@kkopachev could you rebase against master to use latest CI workflow? This should hopefully stabilize your PR :)

kkopachev commented 6 years ago

all green now!

kkopachev commented 6 years ago

FYI, I have a counterpart PR opened on botocore repo https://github.com/boto/botocore/pull/1450 which supposed to fix circularly referenced objects creation on each request.

Bladrak commented 6 years ago

Great thanks! I'll try to publish a new version today