rosmo / mydumper-anon

Anonymize or modify data on the fly when creating MySQL data dumps using a simple YAML configuration file

== What is mydumper? Why? ==

mydumper does not support schema dumping and leaves that to 'mysqldump --no-data'.

== How to build it? ==

Run:

  cmake .
  make

One needs to install development versions of the required libraries (MySQL, GLib, ZLib, PCRE). NOTE: you must use the corresponding MySQL devel package.

Make sure that pkg-config, mysql_config, and pcre-config are all in $PATH.

Binlog dump is disabled by default. To compile with it, add -DWITH_BINLOG=ON to the cmake options.

=== MacOSX homebrew formula ===

Homebrew is a package manager for MacOSX.

brew install https://raw.githubusercontent.com/rosmo/mydumper-anon/master/homebrew/mydumper-anon.rb

== How does consistent snapshot work? ==

This is all done following best MySQL practices and traditions:

For now, this does not provide consistent snapshots for non-transactional engines; support for that is expected in 0.2 :)

== How to exclude (or include) databases? ==

One can use the --regex functionality, for example to skip the mysql and test databases:

mydumper --regex '^(?!(mysql|test))'

Of course, regex functionality can be used to describe pretty much any list of tables.
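To see which names the pattern above lets through, it can be tried outside mydumper. The sketch below uses Python's re module, where this negative lookahead behaves the same way as in PCRE. It assumes the pattern is matched against the string "database.table"; the helper name is_dumped and the sample names are made up for illustration.

```python
import re

# Pattern from the example above. mydumper compiles it with PCRE; this
# negative lookahead has the same meaning in Python's re module.
PATTERN = re.compile(r"^(?!(mysql|test))")


def is_dumped(database: str, table: str) -> bool:
    """Return True if the name passes the filter.

    Assumption: the pattern is applied to "database.table".
    """
    return PATTERN.search(f"{database}.{table}") is not None


if __name__ == "__main__":
    samples = [
        ("mysql", "user"),
        ("test", "t1"),
        ("shop", "orders"),
        ("testing", "users"),
    ]
    for db, tbl in samples:
        verdict = "dump" if is_dumped(db, tbl) else "skip"
        print(f"{db}.{tbl}: {verdict}")
```

Because the lookahead only checks a prefix, a database such as "testing" is skipped as well. Under the same assumption, a pattern like '^(?!(mysql\.|test\.))' would limit the exclusion to the databases named exactly mysql and test.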

== How to exclude MERGE or Federated tables ==

Use the same --regex exclusion syntax. Again, engine-specific behaviors are targeted for 0.2.

== How to anonymize tables ==

You can pass a YAML configuration file to the --anonymize flag. The anonymization configuration supports truncating tables and editing columns (replacing contents, randomizing with wordlists, and generating randomized date/time/datetime values).
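The idea behind wordlist randomization can be illustrated in a few lines of Python. This is only a sketch of the concept, not mydumper-anon's own code: it assumes a wordlist is a text file with one candidate value per line, and the function names load_wordlist and randomize_value are hypothetical.

```python
import random
import tempfile
from pathlib import Path


def load_wordlist(path):
    """Read a wordlist file: one candidate value per line, blank lines ignored."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def randomize_value(words, rng):
    """Pick a random wordlist entry to use in place of the real column value."""
    return rng.choice(words)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical wordlist written on the fly so the sketch is self-contained.
        path = Path(tmp) / "firstnames.txt"
        path.write_text("Alice\nBob\nCarol\n", encoding="utf-8")
        words = load_wordlist(path)
        rng = random.Random()
        print(randomize_value(words, rng))
```

Each dumped row would get a value drawn from the list, so the output keeps a realistic shape without exposing the original data.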

Example of all operations currently implemented:

anonymizer_settings:
  randomize:
    # wordlist_id: file_with_words_one_per_line.txt
    firstname: firstnames.txt
    lastname: lastnames.txt
    fullname: fullnames.txt

database_name:
  table_name_1:
    truncate: yes
  table_name_2:
    edit: