Open phainom opened 4 years ago
@phainom is it possible to resolve this if you escape {
used in the regex? E.g. "{{{a},}}".format(a="a")
works for me. See https://docs.python.org/3.3/library/string.html#format-string-syntax Not sure if this resolves the issue. If you provide an example I can see if I find a fix.
Say the query in the .sql file is something like
select 'hello world' as col1 where 'aa' similar to 'a{1,2}'
This will work in raw sql, but throw an error when used with dbrequests, since it will look for a parameter. What do you mean by escaping in that case? Escaping the {
will lead to a sql error or a different meaning of the regex.
btw, we should move from .format to f-strings for better readability (https://www.python.org/dev/peps/pep-0498/).
What you can do to partially fill a string containing curly brackets is to set double brackets.
"select 'hello world' as {col} where 'aa' similar to 'a{{1,2}}'".format(col="col1")
# returns "select 'hello world' as col1 where 'aa' similar to 'a{1,2}'"
If I have:
query.sql
SELECT {val} regexp "a{{1,2}}";
So instead of {1,2}
we write {{1,2}}
. Then I can execute:
db.send_query(
dbrequests.Query("query.sql", val="'aa'", sql_dir="../").text
)
and
BTW: I just noticed that format is only executed if the query lives in a file, not if I pass a query as string.
Got it, you're right, I can introduce double curly brackets as soon as I call the sql from dbrequests. What I meant is that I cannot call a fully functioning sql via dbrequests, but have to break the sql to make it work (even for a query where nothing is parametrized). This breaks the most basic feature of the package, namely "Easy sending of raw sql and output as pandas DataFrames". I don't know if there is an intuitive solution for this though, we would probably need to check the passed kwargs before reformatting the query.
BTW: I just noticed that format is only executed if the query lives in a file, not if I pass a query as string.
I would like to view that as a feature rather than a bug. :grin:
What I meant is that I cannot call a fully functioning sql via dbrequests, but have to break the sql to make it work (even for a query where nothing is parametrized). This breaks the most basic feature of the package, namely "Easy sending of raw sql and output as pandas DataFrames".
I agree. We just have an implementation detail - the use of format - bubbling up here.
I don't know if there is an intuitive solution for this though, we would probably need to check the passed kwargs before reformatting the query.
I see two options:
{len(arg)}
is a legal expression. So it's not going to be easy unless python offers something to help us I don't see yet.like you say, we try to check in advance for keywords. I find this difficult, because format allows also for function calls, e.g. {len(arg)} is a legal expression. So it's not going to be easy unless python offers something to help us I don't see yet.
I would argue that writing a .sql with {len(arg)} instead of calling len when passing arg in send_query is bad practice, and we could think about not allowing this. We would need to call locals()
, compare with the output of string.Formatter.parse() and auto-escape brackets which don't match passed arguments. This also has the advantage of being able to throw a more indicative error when passing non-matching keyword arguments. But I also agree that this is probably overly complicated and prone to errors due to edge cases. Switching to a different system will break all code which uses the {parameter} format, if we do this we could just move to a different package.
If we don't like these options, I can continue to circumvent the problem and we resolve this issue with wontfix.
I have an issue when using regular expressions in SQL, eg via
similar to
and {}, because the format() resolution thinks this points towards a parameter. There should at least be an option to pass the raw sql in Query. Currently I circumvent the issue with usingdb.query
instead ofdb.send_query
, which then needs separate boilerplate code for passing the right query.