haskellari / postgresql-simple

Mid-level client library for accessing PostgreSQL from Haskell

Document encoding of `formatQuery` #80

Open dustin opened 2 years ago

dustin commented 2 years ago

`formatQuery` returns a ByteString which is meant to be printed out (debugging, logging, etc.), but the character encoding is not specified. I assume it's UTF-8, but it'd be good to clarify that in the docs. (Or, alternatively, return Text.)

formatQuery :: ToRow q => Connection -> Query -> q -> IO ByteString

phadej commented 2 years ago

It's not that simple: it relies on what libpq does, e.g. in https://hackage.haskell.org/package/postgresql-libpq-0.9.4.3/docs/Database-PostgreSQL-LibPQ.html#v:escapeStringConn

... and also what the Query literal arguments were (as those are not processed at all).

I.e. I actually have no idea what the result is when you insert non-ASCII text field values: how they are escaped, or whether it indeed depends on Connection state / settings.

dustin commented 2 years ago

OK, this is helpful. It does seem to be related to client encoding, which can be specified on the connection string. It might be useful to mention some of this here. As it is, my configuration appears to work OK with UTF-8, but that's a guess and character encoding guesses can go bad. :)
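One way to stop guessing is to ask the session itself. The sketch below is only an illustration, not something the library documents for this purpose: `sessionClientEncoding` and `looksLikeUtf8` are hypothetical helper names, and it assumes the standard PostgreSQL `SHOW client_encoding` command returns a single text row on your server.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import           Data.Text                  (Text)
import qualified Data.Text                  as T
import           Database.PostgreSQL.Simple (Connection, Only (..), query_)

-- | Hypothetical helper: ask the server which client encoding this
-- session uses. Returns Nothing if the result is not a single row.
sessionClientEncoding :: Connection -> IO (Maybe Text)
sessionClientEncoding conn = do
  rows <- query_ conn "SHOW client_encoding"
  pure $ case rows of
    [Only enc] -> Just enc
    _          -> Nothing

-- | Hypothetical helper: accept spellings such as "UTF8" or "utf-8".
looksLikeUtf8 :: Text -> Bool
looksLikeUtf8 enc = T.toUpper (T.filter (/= '-') enc) == "UTF8"
```

Logging this value once at startup would at least make a non-UTF-8 session visible before the formatted queries are needed. It says nothing about how literal bytes already present in the `Query` are encoded.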

phadej commented 2 years ago

There is already a paragraph saying

This function is exposed to help with debugging and logging. Do not use it to prepare queries for execution.

And the resulting type is ByteString. Ideally you wouldn't assume anything about it, i.e. your logger can take arbitrary ByteStrings and log them in a human-readable way when possible, and not when that isn't possible.

(EDIT: I often use a decodeUtf8Lenient in situations like this, i.e. a UTF-8 decoding function which doesn't fail on invalid input. You may choose to use something different; it depends. I'd like to avoid giving general advice, as I really cannot.)
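A minimal sketch of that kind of lenient decoding, assuming the `text` and `bytestring` packages. It uses `decodeUtf8With lenientDecode`, which is available across `text` versions, rather than any particular `decodeUtf8Lenient`. The names `renderLenient`, `renderOrEscape`, and `sample` are hypothetical, and treating the bytes as UTF-8 remains a guess, not a guarantee.

```haskell
import qualified Data.ByteString          as BS
import           Data.Text                (Text)
import qualified Data.Text                as T
import qualified Data.Text.Encoding       as TE
import           Data.Text.Encoding.Error (lenientDecode)

-- | Assumes (does not verify) that the bytes are mostly UTF-8.
-- Invalid bytes become U+FFFD instead of raising an exception.
renderLenient :: BS.ByteString -> Text
renderLenient = TE.decodeUtf8With lenientDecode

-- | Stricter alternative: decode only if the whole input is valid
-- UTF-8, otherwise fall back to the escaped 'show' form of the bytes.
renderOrEscape :: BS.ByteString -> Text
renderOrEscape bs = either (const (T.pack (show bs))) id (TE.decodeUtf8' bs)

-- | Hypothetical sample: 'S' followed by 0xFF, which is never valid UTF-8.
sample :: BS.ByteString
sample = BS.pack [0x53, 0xFF]
```

The lenient version keeps the log line readable but hides which bytes were invalid. The escaping version preserves them at the cost of readability. Which trade-off fits depends on the setup.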

dustin commented 2 years ago

Yeah, that's what I'm doing. I just want to make sure it's even a sensible first guess as well as what might affect it.

This is one of those things that might work well in development but not work at all in production and then wouldn't be noticed until stuff was broken and we needed it. :)

phadej commented 2 years ago

As I said, I'll avoid giving any general advice here. You should figure out what works in your setup.