f3-factory / fatfree-core

Fat-Free Framework core library
GNU General Public License v3.0
208 stars 87 forks source link

SQL mapper auto-quoting mechanism #244

Open xfra35 opened 6 years ago

xfra35 commented 6 years ago

This is a followup of this conversation.

The SQL mapper tries to automatically quote the keys that are located in the ORDER BY clause. In most cases, this mechanism is working well, but there are some special cases where it doesn't work (mainly when a function is part of the order clause).

In order to fix that, we now check if the order clause is not already quoted before enabling auto-quoting.

Anyway, having to manually quote expressions as they get more complex is a pity, when the mapper is supposed to abstract away SQL engines syntax differences.

I created this issue, so we can think of a way to sort this out in the next major release.

The ideal way to handle this would be to improve the auto-quoting mechanism so that it can handle all possible cases, but I doubt this is possible, as discussed in the conversation above.

Another way to handle this would be to have a F3-specific quoting character, that would allow to manually quote order clauses (and why not filters), without having to resort to engine-specific quoting characters.

ikkez commented 6 years ago

I wrote a "little" preg_replace for cortex to test some auto quotation. What do you think?! https://github.com/ikkez/f3-cortex/blob/05ccecb201e6dc624418a4b12f54d549ce5a1bc7/lib/db/cortex.php#L2624-L2636 works with a lot of order statements, like

['order' => 'year desc, title']
['order' => 'field,FIELD(status,3,2,1)']
['order' => 'desc DESC,last DESC NULLS FIRST,first NULLS LAST']
['order' => 'IF(FIELD(id,3,2,1,4)=0,1,0) DESC,FIELD(id,3,2,1,4)']

see: https://regex101.com/r/eArmDM/1

xfra35 commented 6 years ago

Wow that's impressive! You've just reached the Master Of Regex level :laughing:

I've tried all the edge cases that were described in the initial conversation, and they all seem to be covered.

Can you explain a little bit what's the rationale behind it? I got a little bit confused by the comments.

ikkez commented 6 years ago

Haha, yeah thanks for the flowers, but actually it's a simple thing: we want to match all fields, so it's basically a word (\w), but it must not be only numbers and it could be prefixed by a table/schema selector, so I came to (\b\d?[a-zA-Z_](?:[\w\-.])*). Now this also matches all the function names like FIELD(). The idea was that functions are always followed by parentheses, so \w+\h?\( should match this. I also assume that in an order statement, fieldnames are only the first "word" (no space-char involved) in a group, seperated by a ,. Anything else are further arguments for the order operation, like DESC NULLS LAST. So \b(?!\w+)(?:\s+\w+)+ and \)\s+\w+) match such arguments after a fieldname or closing function parentheses. Actually I first had an alternative pattern like (?<!\w\h|\)\h)(?!\w+\h*\()(\b\d?[a-zA-Z_](?:[\w\-.])*) which reads more natually and utilizes more lookahead/lookbehind conditions, but it's slower than the current approach to just fetch all "skippable" matches in an extra group too, and just return them untouched within the callback. 😅