humanmade / go-anonymize-mysqldump

Allows you to pipe data from mysqldump or an SQL file and anonymize it.
GNU General Public License v3.0

Get this working with multi-line statements #5

Closed: nathanielks closed this issue 4 years ago

nathanielks commented 4 years ago

I discovered today that the tool can't parse multi-line INSERT statements such as the following:

INSERT INTO `wp_cavalcade_jobs` VALUES
(1,1,"wp_version_check","a:0:{}","2017-08-24 08:15:12","2017-09-11 20:15:12",43200,"failed"),
(2,1,"wp_update_plugins","a:0:{}","2017-08-24 08:15:12","2017-09-11 20:15:12",43200,"failed"),
(3,1,"wp_update_themes","a:0:{}","2017-08-24 08:15:12","2017-09-11 20:15:12",43200,"failed"),
(4,1,"wp_scheduled_delete","a:0:{}","2017-10-15 10:11:15","2017-10-15 10:11:15",86400,"failed");

Because we parse the input line by line, the parser treats each row of values as its own SQL statement, which breaks it. I need to either merge the rows into a single line or find some other solution.
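One way the "merge into a single line" idea could look is sketched below. This is only an illustration and not necessarily what the eventual fix does: statementJoiner and Feed are hypothetical names, the sample dump is made up, and the sketch assumes a statement always ends with a ; at the end of a line (a ; at the end of a line inside a quoted string would fool it).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// statementJoiner buffers input lines until a line ends with ';'.
// Hypothetical helper for illustration; not part of go-anonymize-mysqldump.
type statementJoiner struct {
	buf strings.Builder
}

// Feed adds one line. It returns the complete single-line statement and true
// once a line ending in ';' has been seen, otherwise "" and false.
// Assumption: ';' at the end of a line always terminates the statement.
func (j *statementJoiner) Feed(line string) (string, bool) {
	line = strings.TrimSpace(line)
	if line == "" {
		return "", false
	}
	if j.buf.Len() > 0 {
		j.buf.WriteByte(' ') // separate the joined lines with one space
	}
	j.buf.WriteString(line)
	if !strings.HasSuffix(line, ";") {
		return "", false
	}
	stmt := j.buf.String()
	j.buf.Reset()
	return stmt, true
}

func main() {
	// Made-up sample input standing in for a mysqldump stream.
	dump := "INSERT INTO `t` VALUES\n(1,\"a\"),\n(2,\"b\");\n"

	scanner := bufio.NewScanner(strings.NewReader(dump))
	var j statementJoiner
	for scanner.Scan() {
		if stmt, ok := j.Feed(scanner.Text()); ok {
			fmt.Println(stmt) // hand the merged statement to the SQL parser here
		}
	}
}
```

With real dumps the merged statement can get very long, so the scanner's buffer limit would probably need raising with scanner.Buffer before reading.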

nathanielks commented 4 years ago

It might be possible to set a flag that says "treat the following lines as part of an insert statement for table X until we receive a new query."

nathanielks commented 4 years ago

We could also store the initial INSERT INTO statement and then prepend it onto the following lines until a new query is received.

Another option, building on the flag idea above, would be to look for a ; at the end of the line, which would signify that the query has terminated.
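A rough sketch of how these two ideas might combine: remember the INSERT INTO header, prepend it to each following row, and drop it once a row ends in ;. The names insertExpander and Expand are hypothetical, and the sketch assumes the header sits on its own line ending in VALUES and that every row starts with an opening parenthesis. It is not necessarily how the linked commit solves the problem.

```go
package main

import (
	"fmt"
	"strings"
)

// insertExpander remembers the most recent "INSERT INTO ... VALUES" header
// and prepends it to each following row of values.
// Hypothetical helper for illustration only.
type insertExpander struct {
	prefix string // empty when we are not inside a multi-line INSERT
}

// Expand returns the statement to hand to the parser for this line, or false
// when the line yields no statement by itself (a header or a blank line).
func (e *insertExpander) Expand(line string) (string, bool) {
	line = strings.TrimSpace(line)
	upper := strings.ToUpper(line)
	switch {
	case line == "":
		return "", false
	case strings.HasPrefix(upper, "INSERT INTO") && strings.HasSuffix(upper, "VALUES"):
		// Header of a multi-line INSERT: store it and wait for the rows.
		e.prefix = line
		return "", false
	case e.prefix != "" && strings.HasPrefix(line, "("):
		// A row of values: turn it into a standalone single-row INSERT.
		last := strings.HasSuffix(line, ";")
		row := strings.TrimRight(line, ",;")
		stmt := e.prefix + " " + row + ";"
		if last {
			e.prefix = "" // the ';' terminated the query
		}
		return stmt, true
	default:
		// Any other line is a new query; forget the stored header.
		e.prefix = ""
		return line, true
	}
}

func main() {
	// Made-up sample lines standing in for a mysqldump stream.
	lines := []string{
		"INSERT INTO `t` VALUES",
		"(1,\"a\"),",
		"(2,\"b\");",
		"DROP TABLE `t`;",
	}
	var e insertExpander
	for _, l := range lines {
		if stmt, ok := e.Expand(l); ok {
			fmt.Println(stmt)
		}
	}
}
```

Compared with merging everything into one line, this keeps each statement short, at the cost of turning one multi-row INSERT into many single-row ones in the output.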

nathanielks commented 4 years ago

Fixed via https://github.com/humanmade/go-anonymize-mysqldump/commit/61769d68f2df46c31d1fd1c6bbcca6c23676ef84