canonical / sqlair

Friendly type mapping for SQL databases
Apache License 2.0
17 stars 8 forks source link

Allow SQL functions as column names #82

Closed Aflynn50 closed 6 months ago

Aflynn50 commented 1 year ago

Relax the restriction that columns in output expressions can only be of the form prefix.name. SQLair will now accept func(...) as a column name in output expressions of the form func(...) AS &Type.tag.

The functions are parsed as a string made up of alphanumeric chars + _ following by an opening bracket containing anything at all (including SQLair io expressions).

An error will be thrown if the user tries to put a function into a map or struct with an asterisk, i.e. func(...) AS &MyStruct.*.

Functions in output expressions can now also contain io expressions e.g. func($P.id) AS &M.id. This has been achieved by refactoring Prepare to generate the SQL inside prepareOutput and prepareInput and build up the ParsedExpr progressively rather than doing a double pass (prepareParts can be called recursively from inside prepareOutput, adding the prepared information to the ParsedExpr on the fly)

Aflynn50 commented 10 months ago

Notes from an in person discussion with @niemeyer and @letFunny.

An aim of this PR is to allow users to write input statements inside functions that are part of output expressions. For example: SELECT max($Person.id, 10) AS &Person.id FROM .... Output expressions are currently parsed into an outputPart. This includes the list of columns captured in the output expression. The columns must now contain a representation of the function, including any inputParts it contains.

At the time of writing, this PR saves functions as a ParsedExpr which is stored within the outputPart. When the SQL is generated the ParsedExpr must be recursed over by Prepare to generate the SQL for the function (including any inputParts within it). This requied some changes to the way preparing worked such as building up the SQL at the same time as it is prepared rather than doing it afterwards.

An alternative method of parsing the functions would be to not save them in the output expression. For example: `Bypass[SELECT max(] Input[Person.id] Bypass[, 10)] Output[Person.id]. The limitation with this approach is that it is not easily possible to parse functions in the current syntax. For example:

   SELECT (col1, max($Person.id, 10), col3) AS (&Person.id1, &Person.id2, &Person.id3) FROM ...
-> Bypass[SELECT (col1, max(] Input[Person.id] Bypass[, 10), col3)] Output[Person.id1 Person.id2 Person.id3]

Parsing like this makes it a lot harder to generate the SQL. The parser would need to remember the position of each of the columns somewhere to generate the SQL needed:

SELECT col1 AS _sqlair_0, max($Person.id, 10) AS _sqlair_1, col3 AS _sqlair_2 FROM ...

This appraoch is implemented here, but only for functions associated with a single type rather than in a list of columns.