tomjaguarpaw / haskell-opaleye

Other
605 stars 115 forks source link

Thoughts on some types #24

Closed ocharles closed 8 years ago

ocharles commented 9 years ago

Hi, this issue is going to be a little rambly, but hopefully I can provide some useful information....

As I've mentioned before, I don't really like the way records interact in Opaleye. data Foo a b c d e is not idiomatic Haskell, and it makes adding Opaleye in later difficult. However, the benefit of having named accessors is an indispensable win - so whatever I suggest needs to be able to provide this.

The first observation is that we're doing things this way in order to "vary the idiom". Sometimes we are working with a table where everything is a Column, other times everything is in TableProperties, and other times everything is in Identity (you've executed the query).

This notion comes up a lot - we see it once in generics-sop (all products/sums are parametrized by a functor) and again in vinyl. I would wager that it's possible to achieve what we have at the moment using vinyl - we get named fields and thus order-independent access/construction. Along with this, the specification of tables becomes easier - all you need to do is list the types (it's unclear whether you would list PostgreSQL types or Haskell types), and you only need to do that once. We're then moving towards something like

someTable :: Table '[Int, String, String, UTCTime]

The next observation is that the type inference is pretty hairy - and I'd say that's almost entirely due to Default. With the above idea you now have an inductive structure that you can meaningfully fold over. Running a query simply swaps out the functor from Column to something like:

data PostgreSQLSimple :: * -> * where
  PostgreSQLSimple :: FromField a => PostgreSQLSimple a

(there are details here as to when you're able to capture this constraint, but I can't see any reason it can't be done, off hand).

Again, just by varying the functor you get more useful properties. A left join should be inferrable now, because a left join simply changes your functor from f to Compose Nullable f, perhaps joining Nullable (Nullable a) along the way.

This is bit of a sketch of ideas, but hopefully I'll find time to help contribute some of this work.

Has anyone explored any of these ideas?

bergmark commented 9 years ago

Did you take aggregations into account?

Have you looked at what our silk-opaleye package generates? Type inference can still be an issue there, but it's usually very easy to write type signatures beforehand, such as Query (To Nullable (To Column User)). You only need to define one data type with our solution, and TH then gives you a bunch of types for free.

If you are interested you can -ddump-splices this file https://github.com/silkapp/silk-opaleye/blob/master/example/Example.hs (also note that our dep opaleye-0.3.0.0.100 is interchangeable with opaleye-0.3.1)

ocharles commented 9 years ago

It looks like aggregations follow much the same pattern - it's just a different choice of functor, in that case the Aggregate functor.

I like the look of the TH stuff, and that might be ok for the meantime (though I've already reflected our database into Opaleye types now - wish I knew about it earlier!)

bergmark commented 9 years ago

Regarding whether to list haskell or opaleye types, we enforce a many-to-one mapping from haskell to opaleye and therefore specify the haskell types. I don't see a reason to change that.

ocharles commented 9 years ago

Yea, I think that is the only sensible way to do it.

ocharles commented 9 years ago

Oh, I forgot to mention one more thing, though it's possible regardless of these ideas. Once you run a query you should end up in the Identity functor. At this point, if everything is Identity and translated to haskell types, it should be possible to use generics to cast from the functor-parameterized type to the underlying Haskell record (in the "obvious" cases where the fields line up directly).

tomjaguarpaw commented 9 years ago

I am very much in favour of new ideas to improve the story around the record types we use in the Opaleye API. Any new approach will have to

I'm sure both of your know this, but for the benefit of others: There is nothing in Opaleye stopping third parties from implementing their own favourite record solution. All the necessary combinators are fully general and not restricted to polymorphic product types or the Default typeclass. (Or at least they should be. If not, please file a bug!)

tomjaguarpaw commented 9 years ago

@ocharles If running the query ends up with all the fields in the Identity functor what do you do before using in the rest of your Haskell code? An Applicative operation to unwrap them all?

ocharles commented 9 years ago

You would have some sort of function that expects an Identity functor and strips it out entirely. This is a bit like Rep and to from generics-sop.

tomjaguarpaw commented 9 years ago

Right, I suspect generics could be very helpful for Opaleye-related datatypes, but I don't really know anything about it.

tomjaguarpaw commented 9 years ago

Oops didn't mean to close!

tel commented 9 years ago

Any luck with this? I'm investigating using Opaleye for a project and would like to see if Vinyl could be used productively. I feel that the "holey" record types are a pretty significant departure from normal Haskell types and I'd rather they live in a parallel world to my Haskell types where I can freely wrap functors around the values.

In a larger context, any guidance on best practices for Opaleye-ready type design would be appreciated :)

tomjaguarpaw commented 9 years ago

I've never tried Vinyl with Opaleye, but often thought it could be very useful.

It's important to reiterate that there's nothing in Opaleye stopping you from whatever datatypes you like, it's just that a lot of "generic programming" is folded into the Default typeclass and it's been easiest so far to use that with the polymorphic product types.

I would suggest trying to make Tables of some vinyl types by hand and then seeing if you can abstract out the commonality. Likewise try to make some QueryRunners by hand and abstract out the commonality.

Getting Vinyl to place nicely with Opaleye would be very cool!

tel commented 9 years ago

I was able to get a little bit of this going, but it's impeded a bit by lack of injective type families.

First we have the generic product profunctor commutator for records. We'll need a little tool called a universe map (sort of, not really, that is a bad name for it)

newtype UMap p (f :: k -> *) (g :: k -> *) (r :: k) = UMap (p (f r) (g r))

and then we can write pRec as a typeclass method

class RecProfunctorProduct rs where
  pRec :: ProductProfunctor p => Rec (UMap p f g) rs -> p (Rec f rs) (Rec g rs)

instance RecProfunctorProduct '[] where
  pRec RNil = dimap (const ()) (const RNil) empty

instance RecProfunctorProduct rs => RecProfunctorProduct (r ': rs) where
  pRec (UMap f :& b) = dimap splitr smashr (f ***! pRec b) where
    splitr :: forall f r rs . Rec f (r ': rs) -> (f r, Rec f rs)
    splitr (r :& rs) = (r, rs)
    smashr :: forall g r rs . (g r, Rec g rs) -> Rec g (r ': rs)
    smashr (r, rs) = r :& rs

If we write a schema mapping in Haskell for a table in our database we can connect column names and types in Haskell's type system. This is a bit ugly because of noted lack of injectivity, so it'd be nice to kill a little duplication here

type family SiteType a :: * where
  SiteType "site_id" = Int
  SiteType "name" = Text
  SiteType "principal_id" = String
  SiteType "as_of" = UTCTime

newtype SiteSchema a = SiteSchema (SiteType a)
newtype SiteColumn a = SiteColumn (Column (SiteSchema a))

Now we write

col :: forall s. KnownSymbol s => UMap TableProperties SiteColumn SiteColumn s
col = UMap (dimap (\(SiteColumn x) -> x)
                  SiteColumn
                  (required (symbolVal (Proxy :: Proxy s))))

for a mostly generic required column marker and define our table schema all at once

tableSite :: Table (Rec SiteColumn SiteColumns) (Rec SiteColumn SiteColumns)
tableSite = Table "Site" $ pRec $ col :& col :& col :& col :& RNil

so that's cool. We could maybe slightly improve it by using the length of the type-list to drive the construction of the record, but that's nbd.

And what if you have nullable columns? Well, that's just bad database design, right? ;)

tomjaguarpaw commented 9 years ago

Ok, so it seems you got somewhere good. What about nullable columns though? Do they not fit into this?

tel commented 9 years ago

You could probably encode that in the schema name listing, something like [NotNull "site_id", Null "name"] but it'd add overhead on non-null databases. You could also name it directly in the "schema" type family somehow, I suppose, but that's where the duplication starts to build.

tel commented 9 years ago

Here's the default machinery defined over the whole Rec type

class DefaultRec p f g as bs where
  defRec :: p (Rec f as) (Rec g bs)

instance ProductProfunctor p => DefaultRec p f g '[] '[] where
  defRec = dimap (const ()) (const RNil) empty

instance
  ( DefaultRec p f g as bs
  , ProductProfunctor p
  , Default p (f a) (g b)
  ) =>
    DefaultRec p f g (a ': as) (b ': bs) where

  defRec = dimap up down (def ***! defRec) where
    up :: Rec f (a ': as) -> (f a, Rec f as)
    up (a :& r) = (a, r)
    down :: (g b, Rec g bs) -> Rec g (b ': bs)
    down (b, r) = b :& r

So we still need to use the *Explicit versions, but at least they're all just queryTableExplicit defRec.

tomjaguarpaw commented 9 years ago

Cool! What goes wrong if you try to define a Default instance with def = defRec?

tel commented 9 years ago

There's not really any way to formulate that.

It'd have to look something like

instance (ProductProfunctor p, Default p a b) => Default p (Rec f as) (Rec g bs) where
  ...

but there's no way to establish a relationship between a and as, b and bs. You could rectify it much the same way you do with the pN combinators by having 16 or so instances... but that feels less good.

tomjaguarpaw commented 9 years ago

I don't know anything about Vinyl, but I'm surprised you can't do this recursively, like

instance (ProductProfunctor p, Default p a b, Default p (Rec f as) (Rec g bs))
      => Default p (Rec f (a ': as)) (Rec g (b ': bs))

or so

tel commented 9 years ago

Oh, well, what do you know. I convinced myself that wouldn't work earlier... but indeed it does

instance
  ( ProductProfunctor p
  , Default p (f a) (g b)
  , Default p (Rec f as) (Rec g bs)
  ) =>
    Default p (Rec f (a ': as)) (Rec g (b ': bs)) where

  def = dimap up down (def ***! def) where
    up :: Rec f (a ': as) -> (f a, Rec f as)
    up (a :& r) = (a, r)
    down :: (g b, Rec g bs) -> Rec g (b ': bs)
    down (b, r) = b :& r

instance ProductProfunctor p => Default p (Rec f '[]) (Rec g '[]) where
  def = dimap (const ()) (const RNil) empty
tomjaguarpaw commented 9 years ago

This is incredibly cool! Please let me know if this is practically usable as a way to get Opaleye working with Vinyl. I think that would be a big win for both libraries!

tel commented 9 years ago

So I was a bit worried for a minute that I was sunk, but then I wrote this miraculous instance:

instance

  (Profunctor p, Default p a b, KnownSymbol s) =>
    Default p (ElField '(s, a)) (ElField '(s, b))

  where def = dimap getField Field def

Here's some of the basic tutorial


-- lib
type Table' a = Table a a
type RTable a b = Table (FieldRec a) (FieldRec b)
type RTable' a = RTable a a

type RQuery rs = Query (FieldRec rs)

-- blehhhhhh
recReq :: KnownSymbol s => String -> Lift TableProperties ElField ElField '(s, Column a)
recReq s = Lift (dimap getField Field (required s))

-- tut

type Person =
  '[ '("name", Column PGText)
   , '("age", Column PGInt4)
   , '("address", Column PGText)
   ]

sname :: SField '("name", a) -- boilerplate-y, I know
sname = SField

sage :: SField '("age", a)
sage = SField

saddress :: SField '("address", a)
saddress = SField

personTable :: RTable' Person
personTable = Table "personTable" $ pRec $ recReq "name"
                                        :& recReq "age"
                                        :& recReq "address"
                                        :& RNil

personQuery :: RQuery Person
personQuery = queryTable personTable

nameAge :: RQuery ['("name", Column PGText), '("age", Column PGInt4)]
nameAge = proc () -> do
  p <- personQuery -< ()
  returnA -< (rget sname p :& rget sage p :& RNil)

That last line looks a little ungainly compared to the tuples used otherwise, but it has a secret!

project :: (rs' ⊆ rs) => QueryArr (FieldRec rs) (FieldRec rs')
project = arr rcast

Type-directed projections!

nameAge2 :: RQuery ['("name", Column PGText), '("age", Column PGInt4)]
nameAge2 = personQuery >>> project
tel commented 9 years ago

This method still causes a royal mess for optional queries, though.

tel commented 9 years ago

Also inference issues arise

youngPeople :: RQuery Person
youngPeople = proc () -> do
  row <- personQuery -< ()
  restrict -< age row .<= 18
  returnA -< row

  where
    age :: FieldRec Person -> Column PGInt4
    age = getField . rget sage

the age bit needs to be called out explicitly since sage doesn't restrict its target by type, only by name. We could restrict it by type as well and then this would work fine, but to do so would require a new set of singletons for accessing the "Haskell-like" typed records.

This only occurs due to the extra indirection through PGNum, though. We're fine with stricter restrictions.

    restrict -< (getField . rget saddress) row .== pgString "1 My Street, My Town"
tel commented 9 years ago

And wait. Nullable doesn't appear to apply at the Haskell level very often. I think the optional combinator was confusing me. These examples work:

type Employee =
  '[ '("name", Column PGText), '("boss", Column (Nullable PGText)) ]

employeeTable :: Table' (FieldRec Employee)
employeeTable = Table "employeeTable" . pRec $ recReq "name"
                                            :& recReq "boss"
                                            :& RNil

sboss :: SField '("boss", a)
sboss = SField

hasBoss :: Query (Column PGText)
hasBoss = proc () -> do
  row <- queryTable employeeTable -< ()
  let name = getField (rget sname row)
  let boss = getField (rget sboss row) :: Column (Nullable PGText) -- type ornament needed here
  let aOrNo = ifThenElse (isNull boss) (pgString "no") (pgString "a")
  returnA -< name .++ pgString " has " .++ aOrNo .++ pgString " boss"
tel commented 9 years ago

(Worth noting that we could access positionally, too, just like with tuples, but I'm trying to push with the lens access.)

tel commented 9 years ago

Something is gummed up about aggregations, though. I can't be sure whether this is an Opaleye misunderstangind or a Vinyl-machinations bug.

    Couldn't match type ‘PGText’ with ‘PGInt8’
    Expected type: Table
                     (FieldRec Widget)
                     (Rec
                        ElField
                        '['("style", Column PGText), '("color", Column PGText),
                          '("location", Column PGInt8), '("quantity", Column PGInt4),
                          '("radius", Column PGFloat8)])
      Actual type: RTable' Widget
    In the first argument of ‘queryTable’, namely ‘widgetTable’
    In the second argument of ‘aggregate’, namely
      ‘(queryTable widgetTable)’
type Widget =
  '[ '("style", Column PGText)
   , '("color", Column PGText)
   , '("location", Column PGText)
   , '("quantity", Column PGInt4)
   , '("radius", Column PGFloat8)
   ]

widgetTable :: RTable' Widget
widgetTable = recTable "wigetTable" -- this is cute!

type family ReplaceTypes as bs :: [ (Symbol, *) ] where
  ReplaceTypes '[] bs = '[]
  ReplaceTypes as '[] = '[]
  ReplaceTypes ( '(name, x) ': as) (ty ': bs) = ( '(name, ty) ': ReplaceTypes as bs)

type WidgetSummary = ReplaceTypes Widget
  '[ Column PGText
   , Column PGText
   , Column PGInt8
   , Column PGInt4
   , Column PGFloat8
   ]

aggregateWidgets :: RQuery WidgetSummary
aggregateWidgets =
  aggregate (pRec $ liftField groupBy
                 :& liftField groupBy
                 :& liftField count
                 :& liftField Opaleye.sum
                 :& liftField avg
                 :& RNil)
            (queryTable widgetTable)
tomjaguarpaw commented 9 years ago

Your pRec doesn't allow the input and output types to differ. I guess you want something more like

class RecProfunctorProduct rs rs' where
  pRec :: ProductProfunctor p => Rec (UMap p f g) rs -> p (Rec f rs) (Rec g rs')
  -- ^^ This needs to be adjusted somehow to allow rs' in the argument

or perhaps some other cunning way of achieving the same end. I'm a bit lost with the typelevel programming.

tel commented 9 years ago

Yeah, I think I stumbled into that a few more times. It's a workable type, but not sufficiently flexible since it tries to handle the variation between input and output uniformly across the whole record instead of just at one or two locations. I'm trying a new approach to pRec which is a bit differently designed now... unfortunately the type level stuff is stumping me now as well!

tel commented 9 years ago

Ah, ha ha. Sometimes I think I need to know better when to stop:

recTable :: forall r l .
            (TupleIso l, TupleIso r, NamesOf l ~ NamesOf r,
             SplitRec' TableProperties (ReqTableSpec l),
             TableSpec (ReqTableSpec l),
             NamesOf (ReqTableSpec l) ~ NamesOf l,
             Sources (ReqTableSpec l) ~ TupleLike l,
             Targets (ReqTableSpec l) ~ TupleLike r) =>
     String -> Table (FieldRec l) (FieldRec r)
recTable s = Table s (splitRec (tableSpec :: FieldRec (ReqTableSpec l)))

That hunk of garbage will automatically intuit the shape of a Table assuming each entry in the schema is required (e.g. the in and out sides of the table spec are identical)

personTable :: RTable' Person
personTable = recTable "personTable"

It does so in a way that supports aggregates and, essentially, tables with different input and output columns. Aggregation works as anticipated.

type WidgetSummary = ReplaceTypes Widget
  '[ Column PGText
   , Column PGText
   , Column PGInt8
   , Column PGInt4
   , Column PGFloat8
   ]

aggregateWidgets :: RQuery WidgetSummary
aggregateWidgets =
  aggregate (splitRec $ Field (P groupBy)
                     :& Field (P groupBy)
                     :& Field (P count)
                     :& Field (P Opaleye.sum)
                     :& Field (P avg)
                     :& RNil)
            (queryTable widgetTable)

Or not at all as you anticipated, depending on what you're anticipating.

> putStrLn $ showSqlForPostgres aggregateWidgets
SELECT "result0_2" as "result1_3",
       "result1_2" as "result2_3",
       "result2_2" as "result3_3",
       "result3_2" as "result4_3",
       "result4_2" as "result5_3"
FROM (SELECT *
      FROM (SELECT "style0_1" as "result0_2",
                   "color1_1" as "result1_2",
                   COUNT("location2_1") as "result2_2",
                   SUM("quantity3_1") as "result3_2",
                   AVG("radius4_1") as "result4_2"
            FROM (SELECT *
                  FROM (SELECT "style" as "style0_1",
                               "color" as "color1_1",
                               "location" as "location2_1",
                               "quantity" as "quantity3_1",
                               "radius" as "radius4_1"
                        FROM "wigetTable" as "T1") as "T1") as "T1"
            GROUP BY "style0_1",
                     "color1_1") as "T1") as "T1"
tomjaguarpaw commented 9 years ago

So there's a crazy, messy, long type, but confined to a small number of places?

tel commented 9 years ago

That's my hope.

On the other hand, if you give up completely automatic table generation then it looks better

reqCol :: forall s a . KnownSymbol s => ElField '(s, P TableProperties (Column a) (Column a))
reqCol = Field (P (required (symbolVal (Proxy :: Proxy s))))

personTable2 :: RTable' Person
personTable2 = Table "personTable" (splitRec $ reqCol :& reqCol :& reqCol :& RNil)
tomjaguarpaw commented 9 years ago

Well I certainly admire your magical powers of the type level!

tel commented 9 years ago

I've been getting things down a bit better (and stealing some ideas from @hesselink and @ocharles). Here's a recent example


newtype UserId = UserId { unUserId :: Text } deriving (Show, Pg.FromField)
newtype SiteId = SiteId { unSiteId :: Int } deriving (Show, Pg.FromField)

type CSiteId = '("site_id", SiteId)
type CPrincipalId = '("principal_id", UserId)
type CName = '("name", Text)
type CAsOf = '("as_of", UTCTime)

type SiteS = '[CSiteId, CPrincipalId, CName, CAsOf]

siteId :: SField CSiteId
siteId = SField

principalId :: SField CPrincipalId
principalId = SField

name :: SField CName
name = SField

asOf :: SField CAsOf
asOf = SField

siteTable :: RTable (To Maybe (ToSchema SiteS)) (ToSchema SiteS)
siteTable = recordTable "Site"

siteQuery :: RQuery (ToSchema SiteS)
siteQuery = queryTable siteTable

siteIds :: RQuery (ToSchema '[CSiteId])
siteIds = proc () -> do
  r <- siteQuery -< ()
  returnA -< col siteId =: getCol siteId r

main :: IO ()
main = do
  conn <- Pg.connect (Pg.defaultConnectInfo
                       { Pg.connectDatabase = "fisher"
                       , Pg.connectUser = "tel"
                       })
  res <- runQuery conn siteQuery :: IO [FieldRec SiteS]
  mapM print res
  Pg.close conn
tel commented 9 years ago

There are about 10 lines in the middle there that are raw boilerplate, but I'm not eager to jump to TH yet.

tel commented 9 years ago

I also wish inference was good enough to not need a annotation on the query. That's rough.

tomjaguarpaw commented 9 years ago

Renzo @k0001 seems to be doing something similar: http://ren.zone/articles/opaleye-sot

tel commented 9 years ago

Oh, cool! I saw that linked at /r/haskell and was going to get around to reading it soon.

Joseph Abrahamson

@tel / sdbo / jspha.com

On Mon, Oct 19, 2015 at 4:44 AM, tomjaguarpaw notifications@github.com wrote:

Renzo @k0001 seems to be doing something similar: http://ren.zone/articles/opaleye-sot

Reply to this email directly or view it on GitHub: https://github.com/tomjaguarpaw/haskell-opaleye/issues/24#issuecomment-149147637

ocharles commented 8 years ago

I should note that I've taken my explorations to https://github.com/ocharles/opaleye-tf - this issue isn't really very well defined, so maybe it should be closed? There's some great discussion here though, so thanks to all involved for contributing ideas!

tomjaguarpaw commented 8 years ago

Sure, feel free to reopen if you need to though.