sunitparekh / data-anonymization

Want to use production data for testing? data-anonymization can help you.
MIT License
459 stars, 92 forks

Use update_columns instead of save! for performance #57

Closed: JasonBarnabe closed this 6 years ago

JasonBarnabe commented 6 years ago

update_columns skips callbacks and validations and does not wrap the write in a transaction. The Rails documentation says it "is the fastest way to update attributes".
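
To make the difference concrete, here is a minimal plain-Ruby sketch of what each call path does. FakeRecord is a hypothetical stand-in written for illustration only; it is not ActiveRecord and not code from this gem, and the ordering of steps inside save! is simplified. It only records which steps would run, so it needs no database.

```ruby
# Hypothetical stand-in for an ActiveRecord model (illustration only).
class FakeRecord
  attr_reader :attributes, :log

  def initialize(attributes)
    @attributes = attributes.dup
    @log = [] # records which steps ran, in order
  end

  def email=(value)
    @attributes[:email] = value
  end

  # Mimics save!: opens a transaction, validates, runs callbacks, writes.
  def save!
    @log << :begin_transaction
    @log << :validate
    raise ArgumentError, "email is blank" if @attributes[:email].to_s.empty?
    @log << :before_save
    @log << :write
    @log << :after_save
    @log << :commit
    true
  end

  # Mimics update_columns: assigns the values and writes, nothing else.
  def update_columns(new_attributes)
    @attributes.merge!(new_attributes)
    @log << :write
    true
  end
end

slow = FakeRecord.new(email: "real@example.com")
slow.email = "anon1@example.com"
slow.save!

fast = FakeRecord.new(email: "real@example.com")
fast.update_columns(email: "anon1@example.com")

puts "save! steps:          #{slow.log.inspect}"
puts "update_columns steps: #{fast.log.inspect}"
```

Both records end up with the same email value; the second path simply does less work per row. The trade-off is that any model validations or callbacks an application relies on are bypassed as well.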

Benchmarked on a local Postgres table with 40,000 rows, batch size 1,000, anonymizing a single email field:

Before changes: 2m 52s
After changes: 1m 32s (about 46% less run time!)
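
As a quick check of the figures above, the short Ruby sketch below recomputes the improvement from the two reported timings and the 40,000-row table size. The variable names are illustrative only. It shows that the 46% figure is the reduction in wall-clock time, which corresponds to a throughput gain of roughly 1.87x.

```ruby
rows     = 40_000
before_s = 2 * 60 + 52 # 2m 52s = 172 seconds
after_s  = 1 * 60 + 32 # 1m 32s = 92 seconds

# Share of the original run time that was saved.
time_saved_pct = (before_s - after_s) * 100.0 / before_s

# How many times faster the run completes.
speedup = before_s.to_f / after_s

rows_per_s_before = rows.to_f / before_s
rows_per_s_after  = rows.to_f / after_s

puts format("time saved: %.1f%%", time_saved_pct) # time saved: 46.5%
puts format("speedup:    %.2fx", speedup)         # speedup:    1.87x
puts format("rows/s:     %.0f -> %.0f", rows_per_s_before, rows_per_s_after)
```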

coveralls commented 6 years ago

Coverage Status

Coverage increased (+0.003%) to 93.797% when pulling 7082127f4ba072036a7fc2407ef043597da94385 on kickbooster:update_columns into db4f509dd9448fb2cfd25e4bb15c3d9116daead0 on sunitparekh:master.

sunitparekh commented 6 years ago

Merged. Thanks for the contribution. I will publish the next version soon.

krainboltgreene commented 6 years ago

Woo!