lpsmith / postgresql-simple

Mid-level client library for accessing PostgreSQL from Haskell
Other
206 stars 71 forks source link

Missing datatype-geometric support in FromField #154

Open victoredwardocallaghan opened 9 years ago

victoredwardocallaghan commented 9 years ago

Hi,

Looking over: http://hackage.haskell.org/package/postgresql-simple-0.4.10.0/docs/Database-PostgreSQL-Simple-FromField.html

It seems we are missing support for geometric datatypes, in particular, point.

See: www.postgresql.org/docs/9.4/static/datatype-geometric.html

Kind Regards, Edward.

lpsmith commented 9 years ago

Indeed. You can add support for these types in your own projects by writing appropriate Haskell data types and associated FromField/ToField instances. Supporting this type of extensionality was a key concern in postgresql-simple from day one.

If you come up with something good, these types might be a good candidate for inclusion into vanilla postgresql-simple. Or at the very least, you could create your own package.

victoredwardocallaghan commented 9 years ago

Hi,

Thanks for the fast response! Yea so I am not exactly sure how to derive that instance in this particular case? I normally abuse GHC.Generic for this sort of thing, black magic to me.

Kind Regards, Edward.

lpsmith commented 9 years ago

You can't derive ToField/FromField instances, as you need to know the syntax that postgresql recognizes and emits. See for example the documentation for the FromField module.

victoredwardocallaghan commented 9 years ago

After bashing my head on the desk for a good while I ended up with:

instance (DB.FromField a, Typeable a, Read a) => DB.FromField (a, a) where
  fromField f mdata = if typeOid f /= typoid point
                        then returnError Incompatible f ""
                        else case B.unpack `fmap` mdata of
                               Nothing  -> returnError UnexpectedNull f ""
                               Just dat -> 
                                  case [ x | (x,t) <- reads dat, ("","") <- lex t ] of
                                    [x] -> return x
                                    _   -> returnError ConversionFailed f dat

Can you tell me what you think? no idea if I am close or not really..

lpsmith commented 9 years ago

Well, that looks like it should work, if you are careful. I'm not sure that I'd recommend adding an FromField instance for tuples though, because they already have FromRow instances. There may not be anything too inherently wrong with that, but wouldn't do that in vanilla postgresql-simple because I think it would lead to more confusing error messages and more subtlety. I'd recommend creating your own datatype instead, something like data Point = Point {-# UNPACK #-} !Double {-# UNPACK #-} !Double

However, there's a couple of more concrete issues here, first of all, a point doesn't contain arbitrary types in the field, it only contains double precision floating point numbers. So your instance is claiming to be far more general than it actually is. (Not to mention, that you could remove the DB.FromField a constraint from your code as is, because you are not actually using it here.)

Also, you'd get far better performance by not unpacking the bytestring, and writing your own parser to work directly on bytestrings. You could use attoparsec, and it should be very simple to do in this case.

Incidentally, you can replace the list comprehension with readMaybe if you are using a sufficiently recent version of GHC and you want to stick with using the read module. If you adopt the datatype definition I gave above, you could write:

case readMaybe dat of
  Just (x,y) -> return (Point x y)
  Nothing -> returnError ConversionFailed f dat

Also, you might find these instances useful for understanding what postgresql is returning:

instance FromField TypeInfo where
    fromField f _mv = typeInfo f
newtype Raw = Raw (Maybe ByteString) 

instance FromField Raw where
   fromField _f mv = pure (Raw mv)

Of course, you could have another datatype which returns both bits of information, and this isn't code you would want to have in production, or on hackage.

victoredwardocallaghan commented 9 years ago

Thanks for that! Yes, I realise my instance was a little too polymorphic, I was just trying to manage getting things to type-check without knowing what I was doing all too well. This is what I winded up with so far:

-- | instance to handle (Double, Double)
-- see: https://github.com/lpsmith/postgresql-simple/issues/156
-- use (uncurry Point x) to wrap (a,a) with Point
data Point = Point {-# UNPACK #-} !Double {-# UNPACK #-} !Double deriving (Eq, Show, Read, Typeable, Generic)

instance DB.FromField Point where
  fromField f mdata = if typeOid f /= typoid point
                        then returnError Incompatible f ""
                        else case B.unpack `fmap` mdata of -- XXX attoparsec parser needed..
                               Nothing  -> returnError UnexpectedNull f ""
                               Just dat -> 
                                  case readMaybe dat of
                                    Just x  -> return x
                                    Nothing -> returnError ConversionFailed f dat

The problem for me is that I don't really understand how to do that attoparsec parser for such a bitstream as I really don't understand pg's bit-level representation of its types. The above thus naturally results in a miss conversion due to the lack of the parser bit:

ConversionFailed {errSQLType = "point", errSQLTableOid = Just (Oid 24583), errSQLField = "geoloc", errHaskellType = "Point", errMessage = "(0,1)"}

What documentation were you using when you worked out the various primitive types you already convert in you FromField instances?

I'll have to think all this over on the weekend, I am sure I am pretty close to the "ah ha" moment but it has not happened yet :/

Kind Regards, Edward.

lpsmith commented 9 years ago

The Raw type and instance I gave above will show you the ascii syntax that postgresql is returning. (Unfortunately, all data is returned in a textual format with postgresql-simple... I'd like to fix this, but it's a little complicated.) Alternatively, the conversion error contains the syntax postgresql is returning, namely (0,1).

As I wrote the case statement, you don't need an instance Read Point; you would actually be using the read instance for (Double,Double), which is probably a close approximation of the syntax PostgreSQL returns for points.

So the type checker looks at this line:

Just (x,y) -> return (Point x y)

And deduces the expected return type of the expression being matched on is Maybe (Double, Double), because we would be matching on Maybe (_,_), whose components are then constrained to be Double by the fact that they are then inserted in the Point data constructor. It knows that readMaybe returns Maybe _, so it fills in the knowledge it gains from the first case statement to use the read instance for (Double, Double).