Open dbp opened 10 years ago
That's an interesting observation. I've never come across this as I've never used ShowConstant
with a string containing "special" characters. If you have the time could you post an example of a very simple query (say the Opaleye equivalent of "SELECT 'mystring' from TABLE') where 'mystring' contains a newline and other special characters, and the SQL it generates? It would also be interesting if you could look into how HaskellDB treats strings under SQL generation. I think it will be in here somewhere: https://github.com/tomjaguarpaw/haskelldb/tree/master/src/Database/HaskellDB/Sql -- I will also investigate further myself, but it's very late here now!
(By the way, I don't think HaskellDB's SQL generation actually uses parametrised queries. We should look further into this in case there are security issues.)
Here's a query:
testQuery = proc () -> constant ("hello \n world" :: Text) -< ()
And printing it with the help of the following helper:
showQuery = putStrLn . show . showSqlForPostgresDefault
Results in the following on the terminal (the extra show is to get the string literal, vs that rendered by putStrLn. String escaping is hard :)):
"SELECT 'hello \\n world' as constant_1"
This is in contrast to printing the string as follows:
putStrLn (show ("hello \n world" :: Text))
Which results in:
"hello \n world"
As expected (and yes, this isn't a funny rendering thing. If I get rid of the extra shows, the first case has the literal "\" and "n" in it, whereas the latter prints on two separate lines).
oh, and @tomjaguarpaw I did look at what it does. What appears to be the problem is the call to quote here: https://github.com/tomjaguarpaw/haskelldb/blob/11668fa3b2ad568b429bbbbcc235addf080540cb/src/Database/HaskellDB/Sql/Default.hs#L518 , which calls escape (just below), which has the behavior of turning special characters into strings representing them. Now I'm assuming that this was supposed to counteract a de-escaping (by way concatenating strings as raw SQL) later on, but I don't understand the whole flow well enough yet to say that for sure.
Thanks for doing that research. HaskellDB has a lot of rough edges. It's not unlikely that it's just doing completely the wrong thing. I'll take a look at it either tomorrow or during the week.
Doesn't the example you provided actually show correct behaviour on the part of HaskellDB? The newline should indeed be escaped in the final SQL string. It doesn't seem that it is being escaped twice.
You talked about writing to the database, so I wonder if it comes from one of the manipulation functions. I shall investigate further.
Well, it depends what happens to the string next... If it is passed as-is, then yes, it seems fine. But I assume that postgresql-simple (or another stage along the way) is escaping again... Because what ends up in the database is wrong.
Here's a manipulation example.
> let t = tableOfTableSpecDef (TableSpec "foocolumn" "footable") :: Table (Wire String)
> putStrLn $ show $ arrangeInsertSqlDef t (fmap Just (constant "hello \n world"))
"INSERT INTO footable (foocolumn)\nVALUES ('hello \\n world')"
In the resulting SQL string the newline character has been replaced by backslash (encoded as double backslash) followed by an 'n' character (encoded just as an 'n' character). This is exactly what should happen before the SQL string is sent to Postgres.
So is there something else also escaping? I'm not sure, but will continue looking.
I think this is due to the standand_conforming_strings
parameter. If standard_conforming_strings
is on
then backslashes are interpreted literally. You have to use "escape strings", i.e. strings prefixed by an E
, e.g E'hello \n world'
, to get special characters.
Can you check whether standand_conforming_strings
is on
or off
on your server? (Strangely for me it is on
if I issue a request through postgresql-simple, but off
if I use pgsql
. I don't know how that can happen, but it does seems to be the case.)
It's on for me both via psql and via postgresql-simple (this is a 9.3 server).
So that makes it seem like the problem is that HaskellDB is putting in the characters, expecting them to become special characters, whereas in normal operation with postgresql-simple, the actual newline characters are passed to postgresql, and thus things work fine. To confirm, both of the following result in proper newlines in the DB (and no problems):
execute "insert into table (s) values (?)" (Only "Hello\nWorld" :: Only Text)
execute_ "insert into table (s) values ('Hello\nWorld')"
That sounds like a correct description of the underlying cause. Perhaps you would like to create a branch with a modified escape
function and see how well that works. I am wary of making that change to the main repo as it may have unintended consquences.
So I can confirm that by not performing escapes on newline characters (in Database.HaskellDB.Sql.escape, commenting the line that handles newlines) that newlines are going in and coming out of the database properly, when using postgres 9.3 (which has that parameter enabled).
So if that were the only target, I would say that the escaping is a bug that should be fixed. But, I'm not sure if other configurations (or other databases) will be like that, so I wonder if this is the right place to be addressing the problem.
If we fork/inline haskelldb into opaleye we don't need to care about other backends.
Is there a branch with a solution for this somewhere?
I just ran across this with quotation marks (inserting json into a text column):
λ> showQuery (constant "[\"astring\"]")
"SELECT '[\\\"astring\\\"]' as constant_1"
Currently there is no patch for this. I think what I will do, unless anyone can see an obvious problem, is to make Opaleye generate only "escape strings", i.e. only strings of the form E'This sentence contains a backslash \\ and a quote \'. The end.'
.
Unless someone really needs it I will only fix this in Opaleye 1 which is being released publically on December 1st. You can get earlier access to Opaleye 1 by sending me an email. It's very nice and removes the need for ExprArr
:)
Actually since this is HaskellDB functionality it makes sense just to patch it there now. Can someone please confirm that this works: https://github.com/tomjaguarpaw/haskelldb/commit/caf1eac24bbccd67d2603d5788ea2bfa686f96c0
Yay it works, thanks!
Smallest fix ever!
I'm getting incorrect escape behavior on newlines (so a newline character is saved as the literal string "\n", and if you have a textarea posting, hitting save multiple times results in "\n", "\n", etc),
As far as I can tell, the default behavior is to send Text, via ShowConstant, to a StringLit (from HaskellDB). This then gets escaped, which involves turning '\n' (a newline character) into "\n" (the string consisting of a backslash and then the letter n). I can only assume that postgresql-simple, which is being used to run the queries, is saving those strings literally (as it should!), by way of parametrized queries or something, resulting in the bug.
I imagine that the way that HaskellDB is used elsewhere necessitates this behavior? I'm nervous about turning Text into OtherLit, as that makes it seem like it will just be spliced into the query (which would be bad, obviously), so I'm wondering what the solution is, and how other people have dealt with this.