nikita-volkov / hasql

The fastest PostgreSQL libpq-based driver for Haskell
http://hackage.haskell.org/package/hasql
MIT License
519 stars 55 forks source link

Hasql.Encoders allows encoding an "array of unknown", which doesn't (can't?) work #147

Open robx opened 2 years ago

robx commented 2 years ago

unknown encodes as postgres protocol Text format, while everything else encodes to Binary format. But this generates an invalid mix of Text and Binary, when unknown values occur in arrays. E.g. the following test case fails (not that it necessarily should pass, but maybe it should fail to type check?):

    testCase "Array of unknown" $
    let
      io =
        do
          x <- Connection.with (Session.run session)
          assertEqual (show x) (Right (Right 3)) x
        where
          session =
            Session.statement ["0","1","2"] statement
            where
              statement =
                Statement.Statement sql encoder decoder True
                where
                  sql =
                    "select array_length($1::int8[], 1)"
                  encoder = Encoders.param (Encoders.nonNullable (Encoders.array (Encoders.dimension foldl' (Encoders.element (Encoders.nonNullable Encoders.unknown)))))
                  decoder =
                    Decoders.singleRow ((Decoders.column . Decoders.nonNullable) Decoders.int8)
      in io

Not sure this is a big deal -- might be sufficient to document this as unsensible for the unknown docs? Another thought I had was to parametrize Value with LibPQ.Format, along the lines of

unknown :: Value Text ByteString
element :: NullableOrNot (Value Binary) a -> Array a
array :: Array a -> Value Binary a

(Context: This came out of experimenting with bypassing postgresql-binary for unknown, related to #146, https://github.com/nikita-volkov/bytestring-strict-builder/pull/11. Essentially I wanted to change Value to

data Value = 
    ValueBinary PTI.OID PTI.OID (Bool -> a -> B.Encoding) (a -> C.Builder)
  | ValueText PTI.OID PTI.OID (a -> ByteString) (a -> C.Builder)

but that fails at element.)

steve-chavez commented 2 years ago

I recall having to workaround this limitation in PostgREST here.

Maybe not related, but there was another limitation with unknown and bytea here.

nikita-volkov commented 2 years ago

I don't remember precisely why it was decided to make unknown encode in a text format. Possibly to let the users avoid dealing with binary format. Any way, it doesn't make an impression of a consistent API now. What you say about the array incompatibility is true and is a direct consequence of that. I've updated the docs to mention that.

I would replace that encoder with a binary one, but I sense that you're not the only people who have already applied it in practice. I have another idea of what can be done about it to make it safe though.

If we move it from the level of Value to Params, then it would make it impossible to construct array encoders from it. Also it seems like it may be useful to let the user optionally specify a specific type OID. That would virtually turn it into a hook that lets users create custom encoders based on textual format. A similar thing with binary format can be introduced later on as well.

What do you guys think of all this?

robx commented 2 years ago

It seems reasonable to me that unknown parameters should be in text format (want to link some documentation that supports this but am coming up short...).

From my point of view, trying to deal with the large JSON bodies in Postgrest, it's a bit hard to ask for changes specifically for unknown, because it seems largely incidental that we use unknown in text format here as opposed to json in binary format (via jsonBytes).

I'm not sure I understand your Value/Params suggestion correctly, but it seems promising. Would it essentially provide a way to bypass the bytestring builder, allowing users to construct Params by supplying postgresql-libpq's (Oid, a -> ByteString, Format)? We'd then need to be able to use such a Param to build SQL.Snippet, which seems built around Value currently.