kibitan / masking

Command line tool for generating anonymized database for MySQL/MariaDB
MIT License

adding a configuration type (masking.yml) that doesn't change data cardinality #27

Open kibitan opened 5 years ago

kibitan commented 5 years ago

background

When we use anonymized data for performance tuning, we don't want masking to change the cardinality of the data (the number of distinct values in a column). Otherwise the query optimizer may not choose the same SQL execution plan it would choose on the real data.

grammar

pass the number of hash digits as an argument

users:
  email:  anonymized+%{hash,4}@example.com
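A minimal sketch of how such a placeholder could be expanded. This is not the masking tool's actual implementation: it assumes `%{hash,N}` means "the first N hex digits of a hash of the original value", picks SHA-256 arbitrarily, and `expand_template` is a hypothetical helper name.

```python
import hashlib
import re

# Hypothetical placeholder syntax taken from the proposal above: %{hash,N}
PLACEHOLDER = re.compile(r"%\{hash,(\d+)\}")


def expand_template(template: str, original: str) -> str:
    """Replace each %{hash,N} with the first N hex digits of SHA-256(original).

    The hash depends only on the original value, so identical inputs
    always produce identical masked outputs.
    """
    digest = hashlib.sha256(original.encode("utf-8")).hexdigest()
    return PLACEHOLDER.sub(lambda m: digest[: int(m.group(1))], template)


template = "anonymized+%{hash,4}@example.com"
masked_a = expand_template(template, "test1@gmail.com")
masked_b = expand_template(template, "test1@gmail.com")
print(masked_a == masked_b)
```

Note that 4 hex digits allow only 65,536 distinct values, so a column with more distinct values than that would still lose cardinality. A larger digit count reduces the chance of collisions.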

expected result

before

MySQL [mydb]> SELECT id, email FROM users ORDER BY id;
+----+----------------------------+
| id | email                      |
+----+----------------------------+
|  1 | test1@gmail.com            |
|  2 | some1@hotmail.com          |
|  3 | test1@gmail.com            |
+----+----------------------------+

after (expected result)

MySQL [mydb]> SELECT id, email FROM users ORDER BY id;
+----+-----------------------------+
| id | email                       |
+----+-----------------------------+
|  1 | anonymized+fdas@example.com | <- same input yields the same output
|  2 | anonymized+32bf@example.com |
|  3 | anonymized+fdas@example.com | <- same input yields the same output
+----+-----------------------------+
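One way to check that a masking run kept the cardinality is to compare which rows share a value before and after masking. The sketch below uses the sample rows from the tables above. `group_ids` is a hypothetical helper, not part of the tool.

```python
from collections import defaultdict


def group_ids(rows):
    """Return the set of id groups whose rows share the same value."""
    groups = defaultdict(set)
    for row_id, value in rows:
        groups[value].add(row_id)
    return {frozenset(ids) for ids in groups.values()}


# Sample rows copied from the tables above.
before = [
    (1, "test1@gmail.com"),
    (2, "some1@hotmail.com"),
    (3, "test1@gmail.com"),
]
after = [
    (1, "anonymized+fdas@example.com"),
    (2, "anonymized+32bf@example.com"),
    (3, "anonymized+fdas@example.com"),
]

# Rows 1 and 3 share a value in both tables and row 2 stands alone,
# so the grouping (and therefore the distinct count) is unchanged.
print(group_ids(before) == group_ids(after))  # True
```

If masking mapped every row to a different value, or all rows to the same value, the two groupings would differ and the check would print `False`.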

Tech tips

kibitan commented 1 year ago

This is related to #71.

kibitan commented 1 month ago

Check out this repo too: snaplet/copycat, and this page: docs.snaplet.dev/references/data-operations/transform

kibitan commented 1 month ago

@tonklon voted for this feature at frankfurt.rb in 2014 Sep. Thank you for the feedback!