Keeper-Security / Commander

Keeper Commander is a python-based CLI and SDK interface to the Keeper Security platform. Provides administrative controls, reporting, import/export and vault management.
https://www.keepersecurity.com/commander.html
MIT License
184 stars 74 forks source link

refactor: Extracted common Cache class for KeeperParams and made it pickable #1178

Open dabla opened 8 months ago

dabla commented 8 months ago

Hello,

We are using the keepercommander library as a secrets backend in Airflow, so for that we implemented a custom SecretsBackend which allows us to use Keeper as a backend for all our Airflow connections. To do so we directly use the KeeperParams class in python, not through CLI, which works great.

The issue we discovered is that when we have a lot of simultanious tasks, the Keeper API get's called to much and we are being throttled meaning tasks are stuch until Keeper allows us to call the API again, which is logical. Still we didn't understand why the API got called so much as the results are being cached in the KeeperParams class.

So after investigation we discovered that even though the results are being cached, it doesn't help in our case because we are running in a multi-processing environment using Kubernetes which not only means threads don't share state but also that the process running the task could be running on another pod.

So our solution would be to pickle (and of course encrypt the pickled result) the cached records of the KeeperParams and store it on the shared filesystem, that way each time we have to instantiated a new KeeperParams, we first check if there is a pickled file available and if so load it. That way we have the cached records instantly available without the need to reconnect to the Keeper API.

The first thing I did is move al cache related dictionnaries in the KeeperParams to a dedicated class named Cache which is pickable but also comparable which makes assertions in the tests easier. I also made the BaseFolderNode comparable. The I implemented the reduce method for KeeperParams which only pickles non sensitive data and the cache.

At the moment we have a custom solution which pickles the required caches from the KeeperParams class, but I thought it would be better if this was directly available in the keepercommander code base, so that in the future refactorings would occur, the pickling would still work and we just have to pickle the KeeperParams without being worried it could break in the future.

Of course I added a unit test which tests the pickling and makes assertations on the fields that should be pickled and those that should not.