armenak / DataDefender

Sensitive Data Management: Data Discovery and Anonymization toolkit
Apache License 2.0

Guarantee uniqueness for columns #8

Open zbateson opened 9 years ago

zbateson commented 9 years ago

Add an option to guarantee that a generated value is never reused, for instance when an email column has a UNIQUE index on a table.

This may be an issue if the number of possible values is less than the number of records in a table. Initially, it may be enough to require a list of possibilities large enough to cover the table. Eventually, the solution should have a way of guaranteeing uniqueness even when a value has already been used, perhaps with the addition of a 'Xeger' pattern.
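
To make the request concrete, here is a minimal sketch of one way the uniqueness guarantee could behave. It is not DataDefender code: the class `UniqueValuePicker` and its method `next_value` are hypothetical names, and a simple counter suffix stands in for the 'Xeger' pattern idea once the supplied list runs out.

```python
import random


class UniqueValuePicker:
    """Hands out replacement values and never returns the same one twice.

    Hypothetical helper for illustration; not part of DataDefender.
    """

    def __init__(self, candidates, seed=None):
        # Deduplicate while keeping order so the fallback below is stable.
        self._base = list(dict.fromkeys(candidates))
        if not self._base:
            raise ValueError("at least one candidate value is required")
        # Shuffle a copy so values are not handed out in list order.
        self._pool = list(self._base)
        random.Random(seed).shuffle(self._pool)
        self._used = set()
        self._overflow = 0

    def _derive(self, n):
        # Fallback generator (assumption): reuse a base value and add a
        # numeric suffix, placed before the '@' when the value is an email.
        base = self._base[(n - 1) % len(self._base)]
        suffix = (n - 1) // len(self._base) + 1
        local, sep, domain = base.partition("@")
        return f"{local}{suffix}{sep}{domain}"

    def next_value(self):
        # First, use up the supplied candidates.
        if self._pool:
            value = self._pool.pop()
            self._used.add(value)
            return value
        # List exhausted: derive values until one has not been used yet.
        while True:
            self._overflow += 1
            value = self._derive(self._overflow)
            if value not in self._used:
                self._used.add(value)
                return value


# Usage: five rows, but only two candidate values.
picker = UniqueValuePicker(["alice@example.com", "bob@example.com"], seed=42)
for _ in range(5):
    print(picker.next_value())
```

Tracking used values in a set keeps the check cheap, but it holds every issued value in memory, which may matter for very large tables.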

armenak commented 8 years ago

This issue has been transferred to JIRA.

LucaFilipozzi commented 5 years ago

Conversely, it would be useful for the same transformation to be applied to the same value everywhere it is seen. For example, luca.filipozzi@gmail.com becomes random@random.com everywhere.
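
One possible way to get this behaviour is a keyed hash, sketched below. This is an assumption about how it could work, not how DataDefender does it: the function `pseudonymize_email`, the `user_` prefix, and the `example.com` domain are all hypothetical. The same input with the same key always yields the same replacement, across tables and across runs.

```python
import hashlib
import hmac


def pseudonymize_email(email, secret_key):
    """Map an email to a stable pseudonym.

    The same (email, secret_key) pair always produces the same output.
    Hypothetical helper for illustration; not part of DataDefender.
    """
    # Normalize so 'A@B.com' and 'a@b.com ' map to the same pseudonym.
    normalized = email.strip().lower()
    # Keyed hash: without the key, the mapping cannot be recomputed.
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
    # Keep 16 hex characters (64 bits) as the local part (assumed format).
    return f"user_{digest[:16]}@example.com"


# Usage: the same address gets the same replacement wherever it appears.
key = b"replace-with-a-real-secret"
first = pseudonymize_email("luca.filipozzi@gmail.com", key)
second = pseudonymize_email("Luca.Filipozzi@gmail.com ", key)
print(first == second)
```

Because the digest is truncated, two different inputs could in principle map to the same pseudonym, so a column with a UNIQUE index would still need a collision check on top of this.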