On newtypes and type safety

0xd34df00d commented 5 years ago

Let's say I have a newtype for increased type safety at the client code level (btw kudos for making the Haskell type the last parameter of QueryRunnerColumnDefault so GND works!):

newtype PKeyId = PKeyId { getId :: Int } deriving (Eq, Show, QueryRunnerColumnDefault PGInt4)

instance Default Constant Int (Column col) => Default Constant PKeyId (Column col) where
  def = Constant $ constantExplicit def . getId

And I use it in a record like

data SomeRecord f = SomeRecord
  { tableId :: TableField f PKeyId PGInt4 NN Opt
  , problemDomainId :: TableField f Int PGInt4 NN Opt
  }

Then let's say I'm writing an update statement:

myUpdate :: PGS.Connection -> PKeyId -> IO ()
myUpdate conn idToUpdate = do
  runUpdate_ conn Update
    { uWhere = \row -> tableId row .== toFields idToUpdate
    , ...
    }

So far, so good — this code type checks and is working as intended.

The problem is if I make a mistake and try to compare problemDomainId rec .== toFields idToUpdate — this code will also type check, despite I'd like it to fail!

If I understand correctly, the problem is that .== operates in terms of Columns with Postgres types, and all the newtype information is lost by that time. Is there a way to alleviate this?

tomjaguarpaw commented 5 years ago

This comes up fairly often so I should address it specifically in the tutorial.

This is the offending line. You're allowing constant of a PKeyId to create columns of any type that can be created from an integer.

instance Default Constant Int (Column col) => Default Constant PKeyId (Column col) where

To fix this you have two options

You can use PKeyId as the type argument to the Columns you create. This has some associated boilerplate as you can no longer use GND.
Secondly, you can make PKeyId polymorphic.

{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE TemplateHaskell #-}

import           Opaleye
import qualified Data.Profunctor.Product.Default
                                               as D
import qualified Data.Profunctor.Product.TH    as TH
import qualified Data.Profunctor               as P
import qualified Database.PostgreSQL.Simple    as PGS

-- The first option
newtype PKeyId = PKeyId { getId :: Int } deriving (Eq, Show)

-- The boilerplate
instance QueryRunnerColumnDefault PKeyId PKeyId where
  queryRunnerColumnDefault =
    queryRunnerColumn unsafeCoerceColumn
                      PKeyId
                      (queryRunnerColumnDefault :: QueryRunnerColumn SqlInt4 Int)

instance D.Default Constant PKeyId (Column PKeyId) where
  def = P.dimap getId unsafeCoerceColumn (D.def :: Constant Int (Field SqlInt4))

-- The second option
newtype PKeyIdP a = PKeyIdP { getIdP :: a } deriving (Eq, Show)

$(TH.makeAdaptorAndInstance "pKeyIdP" ''PKeyIdP)

-- Example to show that the second option works
runSelectPKey
  :: PGS.Connection -> Select (PKeyIdP (Field SqlInt4)) -> IO [PKeyIdP Int]
runSelectPKey = runSelect

0xd34df00d commented 5 years ago

Thanks for your reply! The second option looks way more reasonable (and concise) for me, but I still have a few questions about it.

Firstly, if I have a record Post having a field sourceId :: TableField f (HabrIdP Int) SqlInt4 NN Req (for HabrIdP defined similarly to PKeyIdP) and I try to write a function like

selectKnownPosts candidates = do
  Record { .. } <- selectTable P.postsTable -< ()
  restrict -< in_ candidates sourceId
  returnA -< sourceId

then the type of this function would be along the lines of [Column PGInt4] -> Select (Field PGInt4), while ideally I'd like something like [HabrIdP Int] -> Select (HabrIdP (Field PGInt4) or something along those lines.

Then there are more complicated contexts, but I'm probably already doing something wrong, so let's figure out this example first.

tomjaguarpaw commented 5 years ago

This is really a weakness of Opaleye's in_ function. It needs to be generalised to inMany as below. Could you check that inMany does what you want? I can add it to Operators.hs. In retrospect I would also suggest using type families style for HabrIdP.

{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE Arrows #-}

import           Opaleye
import           Opaleye.TypeFamilies
import           Opaleye.Internal.Operators (EqPP)
import qualified Data.Profunctor.Product.Default as D
import           Control.Arrow
import qualified Data.Profunctor.Product       as PP
import qualified Data.Profunctor               as P

-- Define the newtype for your key type families style
newtype HabrIdP f = HabrIdP {
  getIdP :: TableField f Int SqlInt4 NN Req
  }

-- Unfortunately need this boilerplate, at least until
-- https://github.com/tomjaguarpaw/haskell-opaleye/pull/430 gets merged.
instance (D.Default p (TableField f Int SqlInt4 NN Req)
                      (TableField g Int SqlInt4 NN Req),
          PP.ProductProfunctor p) => D.Default p (HabrIdP f) (HabrIdP g) where
  def = HabrIdP PP.***$ P.lmap getIdP D.def

-- Then selectKnownPosts has the signature you'd want
selectKnownPosts :: Foldable f => f (HabrIdP O) -> Select (HabrIdP O)
selectKnownPosts candidates = proc () -> do
  Record { .. } <- selectTablePostsTable -< ()
  restrict -< inMany candidates sourceId
  returnA -< sourceId

-- Dummy for your own code
selectTablePostsTable :: Select (Post O)
selectTablePostsTable = undefined

data Post f = Record {
  sourceId :: HabrIdP f
  }

-- To be added to Opaleye
inMany :: (D.Default EqPP fields fields, Foldable f)
       => f fields -> fields -> Field SqlBool
inMany l x = foldr (\a b -> (a .=== x) .|| b) (sqlBool False) l

0xd34df00d commented 5 years ago

I believe another problem is that selectTable returns the type parametrized by O, which has PG types as opposed to Haskell types, so sourceId would also be PGIn4 or something like that, right?

I'll try with the TF approach. The immediate problem in my code is that I already have a few TH functions to avoid some of the boilerplate, but they expect a certain kind of the types of the arguments (basically whatever TableField itself is). Let's see if I will be able to work around that one.

Moreover, I'm not sure how to composeably define Post in my case then. I cannot use tableField with the TF-based newtype anymore to define the column for table, can I?

tomjaguarpaw commented 5 years ago

I believe another problem is that selectTable returns the type parametrized by O, which has PG types as opposed to Haskell types, so sourceId would also be PGIn4 or something like that, right?

I don't know what you mean. Could you clarify, maybe with an example?

Moreover, I'm not sure how to composeably define Post in my case then. I cannot use tableField with the TF-based newtype anymore to define the column for table, can I?

Hmm, maybe there are some rough edges when it comes to nesting HKD types, but you can certainly do:

-- Type definitions
newtype HabrIdP f = HabrIdP {
  getIdP :: TableField f Int SqlInt4 NN Req
  }

data Post f = Post {
  sourceId      :: HabrIdP f,
  somethingElse :: TableField f String SqlText NN Req
  }

-- Boilerplate
pHabrIdP :: PP.ProductProfunctor p
         => HabrIdP (p :<$> a :<*> b)
         -> p (HabrIdP a) (HabrIdP b)
pHabrIdP i = HabrIdP PP.***$ P.lmap getIdP (getIdP i)

pPost :: PP.ProductProfunctor p
      => Post (p :<$> a :<*> b)
      -> p (Post a) (Post b)
pPost post  = Post PP.***$ P.lmap sourceId      (pHabrIdP (sourceId post))
                   PP.**** P.lmap somethingElse (somethingElse post)

-- Simple table definition
postTable :: Table (Post W) (Post O)
postTable = table "postTable" $ pPost $ Post {
     sourceId      = HabrIdP { getIdP = tableField "habrid" }
   , somethingElse = tableField "somethingelse"
 }

jaredramirez commented 4 years ago

I took a slightly different approach to solving this problem. Rather than create a newtype to provide this kind of type-safety, I went with the approach of creating a new sql type. From the original example, we can do:

import qualified Data.Profunctor.Product.Default as D
import Data.Profunctor.Product.Default (def)
import Data.UUID.Types (UUID)
import qualified Opaleye

-- Type definitions

data SqlPKeyId

data SomeRecord f = SomeRecord
  { tableId :: TableField f Int SqlPKeyId NN Opt
  , problemDomainId :: TableField f Int PGInt4 NN Opt
  }

-- Boilerplate

instance D.Default Opaleye.Constant Int (Opaleye.Column SqlPKeyId) where
  def = aliasIntColumn

instance Opaleye.QueryRunnerColumnDefault SqlPKeyId Int where
  queryRunnerColumnDefault =
    Opaleye.fieldQueryRunnerColumn

-- Helpers

aliasIntColumn :: Opaleye.Constant Int (Opaleye.Column sqlType)
aliasIntColumn =
  P.dimap identity Opaleye.unsafeCoerceColumn def_
  where
    def_ :: Opaleye.Constant Int (Opaleye.Column Opaleye.PGInt4)
    def_ =
      def

Now, if you try to compare different IDs in a query, you'd get something like:

    • Couldn't match type ‘Opaleye.PGInt4’ with ‘SomeTable.SqlPKeyId’
      Expected type: Opaleye.Column SomeTable.SqlPKeyId
        Actual type: Opaleye.Internal.TypeFamilies.TableRecordField
                       Opaleye.Internal.TypeFamilies.O
                       Int
                       Opaleye.PGInt4
                       Opaleye.Internal.TypeFamilies.NN
                       Opaleye.Internal.TypeFamilies.Opt

This seems more convenient to me over newtypes, because now you can write

...
Opaleye.restrict -<
SomeTable.id someTableRow
     .== OtherTable.someTableId otherTableRow
...

instead of

...
Opaleye.restrict -<
  (SomeTable.getId . SomeTable.id) someTableRow
      .== (SomeTable.getId . OtherTable.someTableId) otherTableRow
...

Caveat are:

It uses unsafeCoerceColumn, which might not be ideal. Though I think it's completely safe in this situation.
Once you run your query and you're dealing with haskell types, you loose any type-safety around the ids. This seems generally okay (at least to me) because my primary concern is safety in comparing sql columns (joins and where clauses).

You can something like:

newtype HabrIdP f = HabrIdP {
  getIdP :: TableField f Int SqlPKeyId NN Req
  }

and get the best of both worlds.

tomjaguarpaw commented 4 years ago

FYI, if you go the newtype way you can use .=== instead of .== to avoid unwrapping the newtype.

tomjaguarpaw commented 4 years ago

(I may generalise the functionality of .== to that of .=== because I don't think there's any point keeping it restricted)

tomjaguarpaw / haskell-opaleye

On newtypes and type safety #431