stevearc / pypicloud

S3-backed pypi server implementation
MIT License

DynamoCache.clear_all() breaks SSE and billing configuration #249

Open chludwig-haufe opened 4 years ago

chludwig-haufe commented 4 years ago

Hi,

We are running pypicloud on AWS with the S3 storage backend, the DynamoDB cache backend, and the Secrets Manager auth backend. Due to a mandatory company policy, the data written by pypicloud into any of these services must be encrypted at rest on the server side using a customer-managed CMK.

We created and configured the resources needed by pypicloud upfront and run pypicloud with an IAM role that has the necessary privileges; this works fine except when we reload the cache. With the default db.graceful_reload = false, this request calls DynamoCache.clear_all(). In this method, the DynamoDB tables are deleted and re-created. However, the re-created tables lack the SSE configuration required by our company policy.

Similarly, we want to use the on-demand billing mode for the DynamoDB tables (at least until we know the capacity required by our use of pypicloud). However, DynamoCache.clear_all() re-creates the tables in provisioned billing mode.
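For illustration, here is a minimal sketch of the settings a re-created table would need in order to keep both the SSE and the billing configuration. It assumes boto3 is available; the helper names, the table name, and the single-attribute key schema are hypothetical and do not reflect pypicloud's or flywheel's actual schema or code.

```python
from typing import Any, Dict


def build_create_table_kwargs(table_name: str, kms_key_id: str) -> Dict[str, Any]:
    """Build keyword arguments for boto3's DynamoDB ``create_table`` call.

    The key schema below is a placeholder (assumption), not pypicloud's schema.
    """
    return {
        "TableName": table_name,
        # Hypothetical single string hash key, for illustration only.
        "AttributeDefinitions": [{"AttributeName": "id", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
        # On-demand billing: no ProvisionedThroughput is passed in this mode.
        "BillingMode": "PAY_PER_REQUEST",
        # Server-side encryption with a customer-managed KMS key.
        "SSESpecification": {
            "Enabled": True,
            "SSEType": "KMS",
            "KMSMasterKeyId": kms_key_id,
        },
    }


def create_encrypted_on_demand_table(table_name: str, kms_key_id: str) -> Dict[str, Any]:
    """Create the table; requires boto3 and AWS credentials at call time."""
    import boto3  # imported lazily so the builder above has no dependencies

    client = boto3.client("dynamodb")
    return client.create_table(**build_create_table_kwargs(table_name, kms_key_id))
```

Keeping the argument construction separate from the AWS call makes it possible to inspect what would be sent without touching any real tables.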

As a workaround, we can run pypicloud with db.graceful_reload = true; I don't expect the longer time needed to re-build the cache to be an issue in our case. But this won't work if an upgrade requires the re-build of the cache DB because of, say, schema changes.

Do you see any solution short of overriding the clear_all() method in a custom ICache implementation based on DynamoCache? AFAICT, the flywheel engine used by DynamoCache does not expose the necessary config options, so this might require some "hacks".

Regards, Christoph

stevearc commented 4 years ago

In practice, I think that db.graceful_reload = true will do everything you want. If you're worried about a future upgrade potentially requiring table changes, I can say that that's very unlikely to happen any time soon. If you check the release history, there are 3 times when an upgrade has required a cache rebuild:

- 0.2.0 in 2014
- 0.5.0 in 2017
- 1.0.0 in 2017

At this point, the structure is pretty stable and unlikely to change. If I do ever release a future version that requires a DB rebuild, feel free to ping me again on this issue and I'll see if we can find a better solution for you. Does that adequately address your concern?

chludwig-haufe commented 4 years ago

Thanks. If you deem a schema change unlikely, then I am not going to argue. :-)

I realized the issue only when DynamoCache.clear_all() raised an error and left a broken cache because it cannot re-create a table in on-demand billing mode. (It sees a provisioned capacity of 0 and the attempt to create a provisioned table with capacity 0 fails.) Maybe you can mention in the docs that users who, for whatever reason, need to provision the AWS infrastructure themselves should avoid db.graceful_reload = false?
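The failure mode described above can be sketched as follows. An on-demand table's description reports a provisioned capacity of 0, so copying those numbers into a new provisioned table is rejected. The helper below is a hypothetical illustration, not code from pypicloud or flywheel: it takes the "Table" dict of a boto3 `describe_table` response and picks billing arguments that should be safe to pass to `create_table`.

```python
from typing import Any, Dict


def billing_kwargs_from_description(table: Dict[str, Any]) -> Dict[str, Any]:
    """Derive create_table billing arguments from a describe_table "Table" dict.

    Hypothetical helper for illustration; not part of pypicloud or flywheel.
    """
    # BillingModeSummary can be absent for tables that were always provisioned.
    mode = table.get("BillingModeSummary", {}).get("BillingMode", "PROVISIONED")
    if mode == "PAY_PER_REQUEST":
        # On-demand tables report 0/0 capacity; do not copy it into a new table.
        return {"BillingMode": "PAY_PER_REQUEST"}

    throughput = table["ProvisionedThroughput"]
    return {
        "BillingMode": "PROVISIONED",
        "ProvisionedThroughput": {
            "ReadCapacityUnits": throughput["ReadCapacityUnits"],
            "WriteCapacityUnits": throughput["WriteCapacityUnits"],
        },
    }
```

Checking the billing mode before reading the capacity values avoids the attempt to create a provisioned table with capacity 0.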

leorochael commented 4 years ago

I think this issue can be closed...