playframework / playframework

The Community Maintained High Velocity Web Framework For Java and Scala.
http://www.playframework.com
Apache License 2.0
12.56k stars 4.1k forks source link

Add some configs on how the evolutions script parser splits semicolons #12968

Open mkurz opened 4 days ago

mkurz commented 4 days ago

Currently db evolutions scripts will be split on each semicolon (;). Also currently, we (only) have one escape mechanism for that - double ;;.

However, sometimes it's not desired to split on each semicolon and in such cases it can be very cumbersome to manually escape those semicolons.

This is specially true when statements are generated by an external tool which we can not really control.

Two use cases:

First: If you are importing data from another datasource, via a tool or just copy paste them from somewhere, for example with lots of insert statements, it could happen that those statements contain varchars that contain semicolons. E.g.

INSERT INTO foo (col1, col2, ...) VALUES ('abc; def, 'ghi; jkl', ...);
INSERT INTO foo (col1, col2, ...) VALUES ('mno; pqr', 'stu; vwx', ...);
--- 1500 more lines

So first statement inserts strings abc; def and ghi; jkl. Now devs would have to manually escape all the semicolons within the varchar, which is very cumbersome and likely error prone. It would be much nicer to have a mechanism to tell play evolutions reader to just split on ; if it is the last in a line - or make all the ; escape automatically which are not the last one in a line. Both would have the same outcome, so it's just how you want to describe it.

Second: Sometimes you do not want to excape any semicolon in a line, even not the last one. This is specially true for stored procedures.

My idea on how to solve this in general would be to introduce two configs:

play.evolutions.semicolon.splitOnlyIfLastLineCharacter.labelStart = [ ] # E.g. [ '!split-last-semicolon-start' ]
play.evolutions.semicolon.splitOnlyIfLastLineCharacter.labelEnd = [ ] # E.g. [ '!split-last-semicolon-end' ]

so in our first case we could just wrap the sql statements where we want to split on last semicolon (again, only if it is the last character in a line):

-- !split-last-semicolon-start
INSERT INTO foo (col1, col2, ...) VALUES ('abc; def, 'ghi; jkl', ...);
INSERT INTO foo (col1, col2, ...) VALUES ('mno; pqr', 'stu; vwx', ...);
--- 1500 more lines
-- !split-last-semicolon-end

For the second case:

play.evolutions.semicolon.splitNever.labelStart = [ ] # E.g. [ '!dont-split-semicolon-start' ]
play.evolutions.semicolon.splitNever.labelEnd = [ ] # E.g. [ '!dont-split-semicolon-end ]
CREATE PROCEDURE ...
BEGIN
-- !dont-split-semicolon-start
DECLARE ...;
some_loop: LOOP
IF something THEN
LEAVE some_loop;
END IF;
SET @sql = ...;
PREPARE stmt FROM @sql;
EXECUTE stmt;
END LOOP;
-- !dont-split-semicolon-end
END
$$

So in the end we would make the splitting much more flexible.

Also by having the start and end label being an hocon array, we could easily solve a long standing ebean problem by just making play-ebean append a custom start and end label if we convince the ebean author to add meta data comments to its stored procedure.

ihostage commented 4 days ago

Hi, Matthias! :wave: How about the next variant? Have only one config

play.evolutions.semicolon.strategy.label = [ ] # E.g. [ '!split-semicolon' ]
-- !split-semicolon: last
INSERT INTO foo (col1, col2, ...) VALUES ('abc; def, 'ghi; jkl', ...);
INSERT INTO foo (col1, col2, ...) VALUES ('mno; pqr', 'stu; vwx', ...);
--- 1500 more lines
-- !split-semicolon: always
CREATE PROCEDURE ...
BEGIN
-- !split-semicolon: never
DECLARE ...;
some_loop: LOOP
IF something THEN
LEAVE some_loop;
END IF;
SET @sql = ...;
PREPARE stmt FROM @sql;
EXECUTE stmt;
END LOOP;
-- !split-semicolon: always
END
$$
mkurz commented 4 days ago

Yes also good idea.

My idea however was that we can also handle labels that we can not control, for example see my PR here: https://github.com/ebean-orm/ebean/pull/3408/files I will wait for the ebean author says and then we can continue thinking about this.