lpsmith / postgresql-simple

Mid-level client library for accessing PostgreSQL from Haskell
Other
206 stars 71 forks source link

Problem with question marks inside string literals #169

Open mohsen3 opened 8 years ago

mohsen3 commented 8 years ago

Executing queries like SELECT * FROM table WHERE column = 'what?' fails with FormatError {fmtMessage = \"1 '?' characters, but 0 parameters\", fmtQuery = ...}. I had a look at the source code and it seems it comes from the buildQuery function from Database.PostgreSQL.Simple module, because this function naïvely replaces all the ?s with the parameters. It is funny that it fails to execute the query even if there is a question mark inside the sql comments.

Currently, as a workaround, I am replacing question marks with "\\x3F" which is the hex code for ?. It escapes from the library parser, but the postgres server parses that as a valid question mark character.

I am not sure if this is a good idea to fix the problem, since it may make the parser much slower and more complicated.

lpsmith commented 8 years ago

Yes, this is an unfortunate defect inherited from mysql-simple. And I'm fully aware that it's been in every release of postgresql-simple.

Thanks for the workaround, I hadn't worked out all the details as I haven't needed it yet. Although, sql comments are stripped out if you use the sql quasiquoter. Perhaps it would make sense to also adjust the quasiquoter to include your workaround... but on the other hand this behavior would be a bit opaque.

And yes, I've been reluctant to fix it because a proper fix is somewhat involved, and will likely slow things down, though I also reserve the right to fix it. (It would seem that a very small number of people actually rely on this behavior, which they shouldn't.)

There are two things that I think need doing, which would make this issue much less relevant:

  1. Support protocol-level parameters, so that we don't necessarily have to process the query string at all at run time.
  2. Offer an interpolating quasiquoter that can construct a query string at run-time while parsing at compile-time. (And yes, this is necessary, because there are a couple of situations where protocol-level parameters don't work, and you do want interpolated parameters.) See for example #120 where there's some discussion of the possibilities.
mohsen3 commented 8 years ago

About our specific application, I am not sure if quasiquoter is so helpful since we have to create the queries dynamically. Also, removing the comments in all cases might not be a good idea since at least in our case, we use the comments to include some debug info such as the job id that triggers the sql query and needs to be preserved.

lpsmith commented 8 years ago

Well, if your application is mostly dynamic sql, then perhaps it would make sense to avoid the existing interpolator altogether. The escaping functions are exported in Internal, and as part of the issue I linked to, I am hoping to implement pure escaping functions for postgresql-simple in the near future.

mohsen3 commented 8 years ago

How about adding an option to the connection and allowing the user to configure whether or not he likes to have the question marks interpreted?

lpsmith commented 8 years ago

Well, postgresql-simple has the query_ family of functions that avoid interpolation altogether.

I'm hoping that once postgresql-simple implements the plan outlined above, that the existing interpolator will be much less relevant. And then worrying about performance would be less important...

rightfold commented 8 years ago

What is the current status on supporting protocol-level parameters?

lpsmith commented 8 years ago

They aren't. It'd be nice to add support for protocol level parameters, but unfortunately it's not as simple as it should be (for some entirely silly reasons on postgresql's part). And, I haven't worked on it, yet.

Similarly, supporting binary result formats has some unnecessary complications for some entirely silly reasons on libpq's part.

lpsmith commented 8 years ago

If one is desperate for protocol-level parameters for a select few queries though, you certainly could use the Internal.withConnection and postgresql-libpq for that. It's not pretty, but it would avoid the need to switch to one of the other postgresql libraries that do support protocol level parameters (but also have their own issues to contend with.)