sunitparekh / data-anonymization

Want to use production data for testing, data-anonymization can help you.
MIT License

Unique Constraint Problem with mysql when using Whitelist strategy #35

Closed leifg closed 8 years ago

leifg commented 8 years ago

When I run an anonymization with the whitelist strategy, there seems to be a problem determining existing records.

For a simple example I use the schema_migrations table (this is a Phoenix project, so created_at is called inserted_at), but it happens with any other table.

I need to establish connection explicitly because of #24

require 'data-anonymization'

DataAnon::Utils::Logging.logger.level = Logger::DEBUG
db_options = { adapter: 'mysql', host: 'localhost', port: 3306, pool: 5, username: 'root', password: 'password', database: 'my_project_dev' }
ActiveRecord::Base.establish_connection(db_options)

database 'MyProject' do
  strategy DataAnon::Strategy::Whitelist
  source_db db_options

  table 'schema_migrations' do
    primary_key "version"
    whitelist 'version', 'inserted_at'
  end
end

When I look at the output I find this very strange:

D, [2016-05-24T12:55:40.780067 #55801] DEBUG -- :   SQL (0.2ms)  BEGIN
D, [2016-05-24T12:55:40.785631 #55801] DEBUG -- :   SQL (0.7ms)  INSERT INTO `schema_migrations` (`inserted_at`, `version`) VALUES (?, ?)  [["inserted_at", "2016-04-26 14:53:40"], ["version", 20160320121123]]
D, [2016-05-24T12:55:40.789096 #55801] DEBUG -- :    (3.2ms)  ROLLBACK

Even though the primary key is explicitly set, data-anonymization creates an insert.

Something can't be right there.

sunitparekh commented 8 years ago

Give me a weekend and I will surely look into your issue.

Apologies for not getting back quickly.

leifg commented 8 years ago

No problem,

One thing I discovered: I am only using one database in this scenario. As soon as I use a source AND a destination db, it works fine.

sunitparekh commented 8 years ago

With the whitelist strategy, both databases (source and destination) must be given. I will see if I can add some validation so that the tool reports this error more clearly. Closing the issue for now.
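
To make the kind of validation mentioned above concrete, here is a minimal sketch in plain Ruby. It is not the gem's code: `validate_whitelist_config!` and `MissingDatabaseError` are hypothetical names invented for illustration, and the connection options are assumed to be plain hashes like `db_options` in the original report. The database name `my_project_anonymized` is also an assumption.

```ruby
# Hypothetical pre-flight check; NOT part of the data-anonymization gem.
class MissingDatabaseError < StandardError; end

# Raises unless both connection hashes are present and distinct.
def validate_whitelist_config!(source_db:, destination_db:)
  missing = []
  missing << 'source_db' if source_db.nil? || source_db.empty?
  missing << 'destination_db' if destination_db.nil? || destination_db.empty?
  unless missing.empty?
    raise MissingDatabaseError,
          "Whitelist strategy requires #{missing.join(' and ')} to be configured"
  end

  # Assumption: writing back into the source database is never intended
  # for a whitelist run, so identical settings are rejected as well.
  if source_db == destination_db
    raise MissingDatabaseError,
          'Whitelist strategy requires destination_db to differ from source_db'
  end
  true
end

# Usage: the single-database setup from the report fails fast.
source = { adapter: 'mysql', database: 'my_project_dev' }
begin
  validate_whitelist_config!(source_db: source, destination_db: nil)
rescue MissingDatabaseError => e
  puts e.message
end

# A two-database setup passes the check.
destination = { adapter: 'mysql', database: 'my_project_anonymized' }
validate_whitelist_config!(source_db: source, destination_db: destination)
```

A check like this would surface the misconfiguration before any SQL runs, instead of leaving a rolled-back INSERT in the debug log as the only hint.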