RegioHelden / django-scrubber

Anonymizer for Django database data

Proposal: Integrate scrubber wrapper #37

Open GitRon opened 2 years ago

GitRon commented 2 years ago

Hi there!

Some time ago I wrote a wrapper class that extends and streamlines the scrubbing process. The idea is that things that need to happen do so under the hood (clearing the Django session table, which is a big deal; truncating the scrubber fake data table to reduce the dump size; etc.), and things that should happen can be customised by the developer (creating a superuser with a fixed password, pre- or post-processing).

It's all documented in our Ambient toolbox package: https://ai-django-core.readthedocs.io/en/latest/features/database_anonymisation.html

I wonder if you might be interested in merging this into your package to provide a better and more convenient service for your users.

Best
Ronny
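
For readers who want a picture of the pattern described in this comment, here is a minimal sketch. It is not the code from ai-django-core or django-scrubber: the class name `BaseScrubbingWrapper`, the hook names `pre_scrub`/`post_scrub`, and the step names are all hypothetical, and the database work is replaced by log entries so the example runs without Django. In a real project the mandatory steps would presumably do things like `Session.objects.all().delete()`.

```python
from typing import Callable, List


class BaseScrubbingWrapper:
    """Hypothetical wrapper: mandatory steps always run, hooks are optional."""

    def __init__(self, run_scrubber: Callable[[], None]) -> None:
        # The actual scrubbing call is injected so this sketch needs no Django.
        self._run_scrubber = run_scrubber
        self.log: List[str] = []

    # Hooks a project may override ("stuff that should happen").
    def pre_scrub(self) -> None:
        pass

    def post_scrub(self) -> None:
        pass

    # Mandatory steps ("stuff that needs to happen"); placeholders only.
    def _clear_sessions(self) -> None:
        self.log.append("clear_sessions")

    def _truncate_fake_data(self) -> None:
        self.log.append("truncate_fake_data")

    def process(self) -> List[str]:
        """Run hooks and mandatory steps in a fixed order and return the log."""
        self.pre_scrub()
        self._run_scrubber()
        self.log.append("scrub")
        self._clear_sessions()
        self._truncate_fake_data()
        self.post_scrub()
        return self.log


class ProjectWrapper(BaseScrubbingWrapper):
    """Example customisation: create a known superuser after scrubbing."""

    def post_scrub(self) -> None:
        # Assumption: a real project would create the user via the ORM here.
        self.log.append("create_superuser")


steps = ProjectWrapper(run_scrubber=lambda: None).process()
```

The fixed order inside `process()` is what keeps the mandatory steps out of the developer's hands, while subclasses only touch the hooks.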

lociii commented 2 years ago

Sounds good. From my point of view, it would definitely make sense to integrate it. Any thoughts @costela? Feel free to provide a pull request so that we can discuss the implementation.

costela commented 2 years ago

This definitely sounds promising! Especially the session table stuff is something that we probably should already be doing! :see_no_evil:

If it's not asking too much, would you mind trying to break this down into 2 or 3 PRs? This way we can merge the low-hanging fruit fast (like the session table) and discuss the rest?

Thanks for the feedback/ideas!

GitRon commented 2 years ago

Hi guys, well, it's one class and the pattern can't really be split up. I'll create a PR when I get to it, hopefully some time this week.

GitRon commented 2 years ago

Hi guys, for time reasons I went with the minimal approach and added two flags to the management command. Deleting sessions is active by default (security by design), and optionally you can remove all fake data as well to reduce the dump size (usually, after scrubbing, you want to dump your database and put it somewhere).

https://github.com/RegioHelden/django-scrubber/pull/38
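
As a rough illustration of the two flags described above, the following sketch uses plain `argparse`, the parser that Django management commands build on. The flag names `--keep-sessions` and `--remove-fake-data` and the step names are assumptions made for this example; see the linked PR for what was actually implemented.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with two opt-in/opt-out flags (names are assumed)."""
    parser = argparse.ArgumentParser(prog="scrub_data")
    parser.add_argument(
        "--keep-sessions",
        action="store_true",
        help="Do NOT delete session data (sessions are deleted by default).",
    )
    parser.add_argument(
        "--remove-fake-data",
        action="store_true",
        help="Also remove the fake data table to shrink the dump.",
    )
    return parser


def planned_steps(argv: List[str]) -> List[str]:
    """Return the steps a run would perform for the given arguments."""
    options = build_parser().parse_args(argv)
    steps = ["scrub"]
    if not options.keep_sessions:
        # Secure default: sessions go away unless explicitly kept.
        steps.append("delete_sessions")
    if options.remove_fake_data:
        steps.append("remove_fake_data")
    return steps
```

Making session deletion the default and requiring a flag to opt out matches the "security by design" reasoning: forgetting a flag leaves the dump safer, not less safe.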

GitRon commented 1 year ago

Hi, just wanted to say that this hasn't been forgotten - hopefully I'll be able to create a PR in the next couple of weeks.

GitRon commented 1 year ago

Hi @costela & @lociii

I built the wrapper like this:

  1. create a custom wrapper class which inherits from the base wrapper
  2. create a new management command which calls the wrapper

This seems very complicated for newbies... any suggestions on how to improve that? Maybe we could point to the custom class with a settings variable? So we still have only one management command for everything? If no custom wrapper class is defined, it goes the default way?

What do you think?
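
One way the settings idea could look, as a sketch only: a single command reads a dotted path from a setting and falls back to a default wrapper when nothing is configured. The setting name `SCRUBBER_WRAPPER_CLASS`, the `DefaultWrapper` class, and `load_wrapper_class` are hypothetical. Inside Django, `django.utils.module_loading.import_string` does the same lookup; `importlib` is used here so the example runs standalone.

```python
import importlib
from typing import Optional


class DefaultWrapper:
    """Hypothetical default used when no custom wrapper is configured."""


def load_wrapper_class(dotted_path: Optional[str], default: type = DefaultWrapper) -> type:
    """Resolve 'package.module.ClassName' to a class, or return the default."""
    if not dotted_path:
        return default
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ImportError(f"{dotted_path!r} is not a dotted path to a class")
    module = importlib.import_module(module_path)
    try:
        return getattr(module, class_name)
    except AttributeError as exc:
        raise ImportError(f"{module_path!r} has no attribute {class_name!r}") from exc


# In a management command this might be called roughly as:
#   wrapper_cls = load_wrapper_class(getattr(settings, "SCRUBBER_WRAPPER_CLASS", None))
wrapper_cls = load_wrapper_class(None)
```

With this approach newcomers would only subclass the wrapper and set one setting, and projects that set nothing would get the default behaviour.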

costela commented 1 year ago

hey @GitRon

Sorry, the description sounds a bit too abstract for me. Can we see some code to discuss?

Thanks!

GitRon commented 1 year ago

Hi @costela

here's the docs (currently still in our toolbox package): https://ai-django-core.readthedocs.io/en/latest/features/database_anonymisation.html#how-to-use-the-wrapper

Code is here: https://github.com/ambient-innovation/ai-django-core/blob/master/ai_django_core/services/custom_scrubber.py

We use this in > 10 projects and it works really great 😃

Best Ronny