Open btkador opened 8 years ago
I am not sure if I should consider this as a bug or as a feature request.
For example, you may want to create a rules that match queries where negative floats are inserted: if -12.13123
is replaced with ?
instead of -?.?
, that won't be possible anymore.
The same goes for HEX.
With regards to spaces: the tokenizer replaces multiples spaces with one single space, but won't remove spaces completely.
If the application generates queries in a way where spaces or spaces can become problematic, you should probably not use digest
and use match_digest
with more detailed regex.
Note that adding complexity to the tokenizer will slow down the Query_Processor no matter which queries is processing, that means slower by default. I think is better to leave the tokenizer simpler, and add complexity in regex.
Thinking out loud, this could be a feature request: define a list of match/replace regex to apply to digest_text
after the tokenizer.
But it is quite complex to do this right now.
I think that results more in a "documentation" label for this issue, instead of feature request :-)
How is tokenizer replacing values? I expected it always replaces all kind of dynamic values by ?
varchar
or unsigned integer
get's replaced by ?
, right?signed integer
or signed long
will keep it's -
sign in front (in case it is set)?long
always keeps its .
?NOW()
or INET_ATON()
will be kept like it is?
is the digest_text supposed to look like that with a signed long value?
Instead of the expected
?
there is-?.?
for that value (i.e. -12.13123) . Also it seems that if there is a value inserted as HEX viaX"<value>
it results inX?
.Not sure about the
NOW()
since it's a mysql function, but instead there could also be sent a integer value which will result in?
and therefore not match the rule.Also I've seen that there is a difference in white spaces.. i.e sometimes its
(?,?,?,?,?)
and sometimes(?, ?, ?, ?, ?)
which will result in different digest hash I guess.