Open tysonzero opened 4 years ago
I really like this idea but I'd want to see how it plays out in practice, and ensure that type inference etc still works out relatively nicely.
To elaborate a bit more, I really do agree with the premise here, and it's been quite annoying to do in a large codebase. I've been thinking about the "right" way to handle this, but all the solutions to automagically handle it seem a bit too magical to me.
I am not a huge fan of the extra type variable, I would prefer something that "just works" without breaking any old code. As it is most people probably have references to SqlExpr in their code bases (I know that we do) and so adding that extra type param would create giant friction for what may appear to be a small gain. Obviously we shouldn't paralyze the library for fear of changing the public interface but I feel like there is a happy medium between too much magic and too little.
Perhaps we could have type families
type family SqlNullableResult (inputs :: [*]) result where
SqlNullableResult '[] r = r
SqlNullableResult _ (Maybe r) = Maybe r
SqlNullableResult ((Maybe _) ': _) r = Maybe r
SqlNullableResult (_ ': inputs) r = SqlNullableResult inputs r
type family SqlNullableValue v where
SqlNullableValue v = v
SqlNullableValue (Maybe v) = Maybe v
With this we open up the possibility to write a function where the maybeness of the results are dependent on the inputs.
(&&.) :: (b1 ~ SqlNullableValue Bool, b2 ~ SqlNullableValue Bool)
=> SqlExpr (Value b1)
-> SqlExpr (Value b2)
-> SqlExpr (Value (SqlNullableResult '[b1,b2] Bool))
(?.) :: (PersistEntity val, PersistField typ)
=> SqlExpr (Maybe (Entity val))
-> EntityField val typ
-> SqlExpr (Value (SqlNullableResult '[Maybe typ] typ))
This is of course a little more magical than passing around whether a thing is null but I feel that a library like esqueleto can bear this slightly more complex solution in order to prevent massive breaking changes.
All that said, I haven't done more than make sure that these types compile and pass the existing tests, it could prove that this approach is untenable but I personally feel that the cost of complexity is worth it.
After playing with this more a couple things I discovered.
(?.)
would be served well by a type level Join that could be implemented as
type family Join (m :: * -> *) a where
Join (m a) a = m a
Join a a = a
Then we can just return SqlExpr (Value (Join Maybe (Maybe typ)))
Though it may just be nicer to make a Maybe specific type family
type family SqlMaybe v where
SqlMaybe (Maybe v) = Maybe v
SqlMaybe v = Maybe v
which would allow the return type to be SqlExpr (Value (SqlMaybe typ))
A SqlBaseType type family may be useful
type family SqlBaseType a where
SqlBaseType (Maybe a) = a
SqlBaseType a = a
This could be used in constraints to say Bool ~ SqlBaseType bool
to allow for both Maybe and non maybe values. e.g.
(&&.) :: (Bool ~ SqlBaseType b1, Bool ~ SqlBaseType b2)
=> SqlExpr (Value b1)
-> SqlExpr (Value b2)
-> SqlExpr (Value (SqlNullableResult '[b1,b2] Bool))
I think the full dependent typing of the operators may be a bit overkill though, do we benefit by capturing the nullness of the results other than it being more accurate for modeling SQL?
The reason I ask is because during my experimenting I discovered that if we dependently type (==.) to potentially return a Maybe Bool
you need to update a bunch of stuff including WhereClause
and FromClause
and all of the magic surrounding OnClause
reordering. It seems like a whole bunch of work beyond just having the dependent type.
Final Thoughts:
It seems to me that the main pain points are in dealing with types of Value (Maybe (Maybe a))
and with being forced to coalesce in order to use &&.
and ||.
Therefore I propose that rather than making everything behave according to the rules of SQL we just update any thing that returns a Value (Maybe typ)
to instead return Value (SqlMaybe typ)
which will prevent the nested maybe. and to update the operators that are specified in terms of SqlExpr (Value Bool)
to instead be in terms of Bool ~ SqlBaseType bool => SqlExpr (Value bool)
I feel like these two changes provide the maximum benefit without hurting type inference or being hugely time intensive to change. So ultimately the type signatures would be:
(&&.) :: (Bool ~ SqlBaseType bool)
=> SqlExpr (Value bool)
-> SqlExpr (Value bool)
-> SqlExpr (Value bool)
and
(?.) :: (PersistEntity val, PersistField typ)
=> SqlExpr (Maybe (Entity val))
-> EntityField val typ
-> SqlExpr (Value (SqlMaybe typ))
There is a sqlType
function in the PersistFieldSql
class. Now it currently only works on the value level. However I think if it were lifted up into a type family and tagged with nullability it could be a lot more useful.
Then we would have:
(&&.) :: SqlType a ~ SqlBool n
=> SqlExpr (Value a)
-> SqlExpr (Value a)
-> SqlExpr (Value a)
This is pretty similar to what you have here, but could also be used for other purposes. It also would avoid any overlapping closed type families, which I personally prefer to avoid for extensibility purposes.
For example I would appreciate a function like:
cast :: SqlType a ~ SqlType b => SqlExpr (Value a) -> SqlExpr (Value b)
Currently I have to use veryUnsafeCoerceSqlExprValue
for perfectly safe things like converting from a FooId
to the underlying type.
So am I understanding correctly that your concern is that we will ultimately need a type level sqlType function anyways for something like cast to be typesafe and you don't want to have both that and this SqlBaseType?
More or less I suppose. I just think it's a nice solution to this problem and other related problems.
I also don't like the use of overlapping type families (even when closed).
Ok, so I hadn't heard the term "overlapping type families" before and thought you meant that the families would be duplicating logic. I think I understand now, I guess I am not familiar enough with type level programming to really see the downside of having the simple families that it allows. I was playing around with this and I don't think that lifting sqlType would actually be particularly helpful for cast, though it would allow a completely safe coerce function to be exposed(was this what you meant?). Here is a modified proposal that riffs on what you suggested while keeping the type families as simple as possible.
type IsSqlBool bool = 'SqlBool ~ SqlTypeLifted (Nullable bool)
(&&.) :: IsSqlBool a
=> SqlExpr (Value a)
-> SqlExpr (Value a)
-> SqlExpr (Value a)
The Nullable Type family is what was called SqlBaseType and then SqlTypeLifted is not aware of Maybe. I guess I am not fully understanding your nullable tag idea and how you would accomplish the implementation without the overlaps,I feel like this is a fairly happy medium where the Nullable type class should never have to be expanded(thereby alleviating concerns about maintaining it) since all it does is erase the Maybe for use in SqlTypeLifted.
EDIT: I wanted to see how a casting function could be made type safe. This is what I came up with:
type family IsMaybe t where
IsMaybe (Maybe t) = 'True
IsMaybe t = 'False
class SqlCastable (from :: SqlType) (to :: SqlType) where
instance SqlCastable 'SqlInt32 'SqlInt64 where
instance SqlCastable 'SqlInt32 'SqlReal where
instance SqlCastable 'SqlInt64 'SqlReal where
cast_ :: forall b a t t'. ( t ~ SqlTypeLifted (Nullable a)
, t' ~ SqlTypeLifted (Nullable b)
, IsMaybe a ~ IsMaybe b
, SqlCastable t t'
)
=> SqlExpr (Value a) -> SqlExpr (Value b)
Since the casts are on the sql side they are defined as legal in terms of the sql types.
Similarly coercion could be defined:
type SameSqlType a b = (SqlTypeLifted (Nullable a) ~ SqlTypeLifted (Nullable b), IsMaybe a ~ IsMaybe b)
coerce_ :: forall b a. SameSqlType a b => SqlExpr (Value a) -> SqlExpr (Value b)
coerce_ = veryUnsafeCoerceSqlExprValue
So we start with a library specifically designed around Sql and its type system, and not around Haskell. Both the nullability and the sql type are carried around and handled separately. The run
function would of course call into an actual SQL DB rather than manually executing it.
module SqlEdsl where
import Data.Text (Text)
data Type
= BoolT
| IntT
| TextT
data Nullability
= Nullable
| NotNull
data Expr (t :: Type) (n :: Nullability) where
LitX :: Value t n -> Expr t n
AndX :: Expr 'BoolT n -> Expr 'BoolT n -> Expr 'BoolT n
EqX :: Expr a n -> Expr a n -> Expr 'BoolT n
data Value (t :: Type) (n :: Nullability) where
NullV :: Value t 'Nullable
BoolV :: Bool -> Value 'BoolT n
IntV :: Int -> Value 'IntT n
TextV :: Text -> Value 'TextT n
run :: Expr t n -> Value t n
run (LitX v) = v
run (AndX a b) = case (run a, run b) of
(NullV, _) -> NullV
(_, NullV) -> NullV
(BoolV True, BoolV True) -> BoolV True
(BoolV _, BoolV _) -> BoolV False
run (EqX a b) = case (run a, run b) of
(NullV, _) -> NullV
(_, NullV) -> NullV
(BoolV x, BoolV y) -> BoolV (x == y)
(IntV x, IntV y) -> BoolV (x == y)
(TextV x, TextV y) -> BoolV (x == y)
justV :: Value t 'NotNull -> Value t 'Nullable
justV (BoolV x) = BoolV x
justV (IntV x) = IntV x
justV (TextV x) = TextV x
Next we provide a wrapper around that library to make it feel more like Esqueleto / Haskell, with things like Maybe
instead of a nullability tag. Since it utilizes the previous library, we know it can't form ill-typed SQL queries.
module EsqlEdsl
( Sqlable, GetType, GetNullability
, toValue, fromValue
, Expr, CaseNullability
, val, cast, (&&.), (==.), run
, Type(BoolT, IntT, TextT)
, Nullability(Nullable, NotNull)
) where
import Data.Text (Text)
import SqlEdsl
( Nullability(Nullable, NotNull)
, Type(BoolT, IntT, TextT)
, Value(NullV, BoolV, IntV, TextV)
, justV
)
import qualified SqlEdsl as S
class Sqlable a where
type GetType a :: Type
type GetNullability a :: Nullability
toValue :: a -> Value (GetType a) (GetNullability a)
fromValue :: Value (GetType a) (GetNullability a) -> a
instance Sqlable Bool where
type GetType Bool = 'BoolT
type GetNullability Bool = 'NotNull
toValue = BoolV
fromValue (BoolV x) = x
instance Sqlable Int where
type GetType Int = 'IntT
type GetNullability Int = 'NotNull
toValue = IntV
fromValue (IntV x) = x
instance Sqlable Text where
type GetType Text = 'TextT
type GetNullability Text = 'NotNull
toValue = TextV
fromValue (TextV x) = x
instance (Sqlable a, GetNullability a ~ 'NotNull) => Sqlable (Maybe a) where
type GetType (Maybe a) = GetType a
type GetNullability (Maybe a) = 'Nullable
toValue Nothing = NullV
toValue (Just x) = justV (toValue x)
fromValue NullV = Nothing
fromValue (BoolV x) = Just (fromValue (BoolV x))
fromValue (IntV x) = Just (fromValue (IntV x))
fromValue (TextV x) = Just (fromValue (TextV x))
data Expr a where
Expr :: S.Expr (GetType a) (GetNullability a) -> Expr a
type family CaseNullability (n :: Nullability) a b where
CaseNullability 'Nullable a _ = a
CaseNullability 'NotNull _ b = b
val :: Sqlable a => a -> Expr a
val = Expr . S.LitX . toValue
cast :: (GetType a ~ GetType b, GetNullability a ~ GetNullability b) => Expr a -> Expr b
cast (Expr x) = Expr x
(&&.) :: GetType a ~ 'BoolT => Expr a -> Expr a -> Expr a
Expr x &&. Expr y = Expr (S.AndX x y)
infixr 3 &&.
-- constraints trivially true when type families not stuck
(==.) :: ( GetNullability (CaseNullability (GetNullability a) (Maybe Bool) Bool) ~ GetNullability a
, GetType (CaseNullability (GetNullability a) (Maybe Bool) Bool) ~ 'BoolT
)
=> Expr a
-> Expr a
-> Expr (CaseNullability (GetNullability a) (Maybe Bool) Bool)
Expr x ==. Expr y = Expr (S.EqX x y)
infixr 4 ==.
run :: Sqlable a => Expr a -> a
run (Expr x) = fromValue (S.run x)
Now we test it with some examples.
import EsqlEdsl
data Boolish = Falseish | Trueish
instance Sqlable Boolish where
type GetType Boolish = 'BoolT
type GetNullability Boolish = 'NotNull
toValue Falseish = toValue False
toValue Trueish = toValue True
fromValue x = if fromValue x then Trueish else Falseish
main :: IO ()
main = do
let n, m :: Int
n = 2
m = 7
print . run $ val (Just n) ==. val (Just m)
print . run $ val n ==. val n
print . run $ val True &&. val False
print . run $ val (Just True) &&. val Nothing
print . run $ val True ==. cast (val Trueish)
Note how we have a working and safe cast
function, as well as an ==.
and an &&.
that both handle NULL values in a reasonable way. The output we get is:
Just False
True
False
Nothing
True
Hopefully this gives you a bit of an idea of what I was trying to get at.
Ok I think I understand, the part I was missing was the underlying dsl vs public interface. Even still, it seems like an awful lot of work just to avoid the now 3 simple type families. I thought the whole point of closed type families was the overlapping bit.
In general it seems that one of the guiding principles of this library has been to really rely on Persistent as much as possible. The nullability tagged dsl feels like a departure from that heritage.
While my proposed approach is less rigorous and likely even less correct than the approach you outlined it feels to me more like a middle way. Of course I am incredibly biased since it's the approach I proposed :).
Here is my latest iteration, it's main benefit is that it minimizes the changes to esqueleto and the type level stuff can fit nicely into Persistent by merging the SqlTypeable(shown below) and PersistFieldSql classes. It still utilizes the overlapping type families but they really simplify everything and really bring down the code necessary as well as divergence from persistent in my opinion.
type family Nullable t where
Nullable (Maybe t) = t
Nullable t = t
type family IsMaybe a where
IsMaybe (Maybe t) = 'True
IsMaybe t = 'False
class PersistFieldSql t => SqlTypeable t where
type SqlType' t :: SqlType
instance {-# OVERLAPPABLE #-} SqlTypeable t => SqlTypeable [t] where
type SqlType' [t] = 'SqlString
instance SqlTypeable [Char] where
type SqlType' [Char] = 'SqlString
instance SqlTypeable T.Text where
type SqlType' T.Text = 'SqlString
instance SqlTypeable TL.Text where
type SqlType' TL.Text = 'SqlString
instance SqlTypeable B.ByteString where
type SqlType' B.ByteString = 'SqlString
instance SqlTypeable Html where
type SqlType' Html = 'SqlString
instance SqlTypeable Bool where
type SqlType' Bool = 'SqlBool
instance SqlTypeable Int where
type SqlType' Int = 'SqlInt64
instance SqlTypeable Double where
type SqlType' Double = 'SqlReal
-- The instances are the same as PersistFieldSql and the classes should probably be merged
-- Constraint on runtime casting using the SQL CAST function (this would probably need to be defined seperately for each db)
class SqlCastable (from :: SqlType) (to :: SqlType) where
instance SqlCastable 'SqlInt64 'SqlInt32 where
instance SqlCastable 'SqlReal 'SqlInt32 where
instance SqlCastable 'SqlReal 'SqlInt64 where
-- Constraint for runtime coercion either due to using the same underlying type or having coercion supported without any issues in sql
class SqlCoercible (a :: SqlType) (b :: SqlType) where
instance SqlCoercible a a where
instance SqlCoercible 'SqlInt32 'SqlInt64 where
instance SqlCoercible 'SqlInt32 'SqlReal where
instance SqlCoercible 'SqlInt64 'SqlReal where
coerce_ :: forall b a t t'. ( t ~ SqlType' (Nullable a)
, t' ~ SqlType' (Nullable b)
, IsMaybe a ~ IsMaybe b
, SqlCoercible t t'
)
=> SqlExpr (Value a) -> SqlExpr (Value b)
coerce_ = veryUnsafeCoerceSqlExprValue
cast_ :: forall b a t t'. ( t ~ SqlType' (Nullable a)
, t' ~ SqlType' (Nullable b)
, IsMaybe a ~ IsMaybe b
, SqlCastable t t'
)
=> SqlExpr (Value a) -> SqlExpr (Value b)
cast_ = veryUnsafeCoerceSqlExprValue -- TODO: implement SQL CAST function here
Ok I think I understand, the part I was missing was the underlying dsl vs public interface. Even still, it seems like an awful lot of work just to avoid the now 3 simple type families.
So the underlying DSL doesn't actually need to exist. The important parts are the GetType
and GetNullability
type families. The underlying DSL is just a neat little proof of correctness.
I thought the whole point of closed type families was the overlapping bit.
When you are operating on Type
or *
, that is very often the usage yes. Which is why personally I avoid ever writing closed type families that operate on Type
or *
. I wish they weren't supported to be honest, similar to how we don't support runtime type introspection without a type class.
In general it seems that one of the guiding principles of this library has been to really rely on Persistent as much as possible. The nullability tagged dsl feels like a departure from that heritage.
As I mentioned the DSL was purely to help build out my proof of concept and prove its correctness. I think this could be done with a type family for nullability combined with making the existing sqlType
function work on the type level.
While my proposed approach is less rigorous and likely even less correct than the approach you outlined it feels to me more like a middle way.
So the biggest correctness issue with the Maybe
hardcoding stuff is that it will almost certainly have holes due to some non-Maybe types being able to output NULL
.
It's similar to how you can have a perfectly reasonable ToJSON
/FromJSON
pairing on Foo
, but if Foo
can be serialized to null
, the abstraction breaks when you start working with Maybe Foo
.
This is why I went so far to build the underlying DSL. I really wanted to be confident that there weren't any holes in the approach.
So the biggest correctness issue with the Maybe hardcoding stuff is that it will almost certainly have holes due to some non-Maybe types being able to output NULL.
I don't understand what case that would be. All operations that can introduce NULL should output a maybe. Maybe is strictly the tag for this thing can be null now in the esqueleto EDSL as it stands today and has no runtime meaning. There is one hole today related to outer joins not enforcing the maybeness but #172 closes that hole. Actually the Maybe is really the default and without a maybe all we are saying is that nothing has occured to this value which could result in null. Maybe is the nullability flag in persistent and esqueleto as it stands and only serves that purpose in the language, if something can be null without being tagged Maybe that would be a bug since the assumption is already being made elsewhere.
All that said there is no reason why SqlTypeable can't have an associated Nullability type, it just seems superfluous to me. I'm sorry if my ignorance betrays me, I am rather new to type level programming having only started on this journey in December of last year.
I hope I'm not being obtuse, but what is the exact problem with closed families on *? It just seems like my approach utilizing Maybe directly instead of a nullable datakind is much more straightforward.
EDIT: Pushed changes based on this discussion so far to https://github.com/bitemyapp/esqueleto/pull/178
I don't understand what case that would be.
All operations that can introduce NULL should output a maybe.
Persistent doesn't enforce this though. There is nothing stopping a dev serializing to and from null for their custom type, and there may be good reason for doing so. Also as mentioned nested Maybe
is already an example of the hole in incorrectly treating nullability as nestable.
I hope I'm not being obtuse, but what is the exact problem with closed families on *?
Specifically it's overlapping type families I dislike, and my dislike for it is the same reason I do like parametricity and purity.
When defining a custom data type I don't like it being prescribed meaning without me explicitly declaring as such. If I define data Option a = None | Some a
for whatever reason, I don't expect it to magically work differently than Maybe
because of a clever overlapping type family or type class that I wasn't aware existed.
It makes it harder for me to reason about my code, similar to how impurity and lack of parametricity make it harder for me to reason about my code.
Confusion around null has come up once again in #183. It seems that out joins enforcing nullability may exacerbate the poor handling of null. @tysonzero I fear that we are letting perfect get in the way of progress. @parsonsmatt do you have an opinion on my proposal to special case maybe with overlapping type families?
The approach used in rel8
looks quite promising for this: https://hackage.haskell.org/package/rel8-1.0.0.1/docs/src/Rel8.Schema.Null.html#Sql
I prefer the Haskell model of
null
, whereJust Nothing
andJust (Just Nothing)
are different values, and where you use<$>
and>>=
to operate on the underlying value.However with that said SQL does not take this approach, and thus trying to model nullability like the above is a very leaky abstraction. It also adds significant unneeded verbosity and a proliferation of functions that would otherwise be unnecessary.
One example downside is that calling functions/operators on nullable values can be quite painful, even when it would be quite painless in both SQL and regular Haskell:
The above cannot really be directly modeled in Esqueleto, so you typically have to do something weird like coalesce the
Bool
into non-nullable form and then operate on that, which is rather painful when the standard||
semantics is desired.Another example of the leakiness is that
joinV
is a no-op, which means even thoughisNothing
andisNothing . joinV
are very different in Haskell, they have the exact same semantics in Esqueleto.One example of verbosity from this is the very common need to use
joinV
in situations where no extra syntax would be needed in SQL. Similarlyjust
is also often needed in situations where no extra syntax would be needed in SQL.My proposal to account for this is admittedly quite a large change, so it may be too backwards breaking to be accepted, but I do think it should be considered.
I think that
SqlExpr
should actually be parameterized by two types, the first representing nullability, and the second representing the actual type itself.Functions that have defined semantics when interacting with null will then have their type generalized accordingly, for example
(||.) :: SqlExpr n Bool -> SqlExpr m Bool -> SqlExpr (n \/ m) Bool
and(^.) :: SqlExpr n (Entity v) -> EntityField m v t -> SqlExpr (n \/ m) (Value t)
.Functions that crash completely when interacting with null can then simply require
NotNullable
at the type level. When serializing to and from Haskell,Nullable
will naturally be mapped to and fromMaybe
.