Open o1lo01ol1o opened 6 years ago
How does this look?
{-# LANGUAGE AllowAmbiguousTypes, DataKinds, FlexibleContexts,
FlexibleInstances, MultiParamTypeClasses, PolyKinds,
RankNTypes, ScopedTypeVariables, TypeApplications,
TypeFamilies, TypeOperators #-}
module PushMaybe where
import Data.Vinyl
import Data.Vinyl.Functor
import Data.Vinyl.TypeLevel
import Lens.Micro
import GHC.TypeLits
-- The example record we will use for testing
testRec :: Rec (Maybe :. ElField) '["id" ::: Int, "name" ::: String, "age" ::: Int]
testRec = xrec (Just 19, Just "Steve", Nothing)
-- | Apply Maybe to the payload of a record field type
type family PushMaybe (x :: k) :: k where
PushMaybe (s ::: x) = s ::: Maybe x
PushMaybe (f :. g) = f :. PushMaybe g
PushMaybe x = Maybe x
-- | Apply Maybe to the payload of each field type in a list
type family PushMaybes (xs :: [k]) :: [k] where
PushMaybes '[] = '[]
PushMaybes (x ': xs) = PushMaybe x ': PushMaybes xs
-- | Something in the spirit of 'traverse' for 'ElField' whose kind
-- fights the standard library.
traverseField :: (KnownSymbol s, Functor f)
=> (a -> b) -> f (ElField '(s,a)) -> ElField '(s, f b)
traverseField f t = Field (fmap (f . getField) t)
-- | Traverse a record applying a function. We can't use 'rmapMethod'
-- because we aim to change the type index.
class PushMaybeC f rs where
pushMaybe :: Rec (Maybe :. f) rs -> Rec f (PushMaybes rs)
instance PushMaybeC f '[] where
pushMaybe RNil = RNil
instance (PushMaybeC ElField xs, KnownSymbol s) => PushMaybeC ElField ((s ::: a) ': xs) where
pushMaybe (Compose x :& xs) = traverseField id x :& pushMaybe xs
-- Manually changing the type of id
testRec2 :: Rec ElField '["id" ::: Int, "name" ::: Maybe String, "age" ::: Maybe Int]
testRec2 = pushMaybe testRec & rlens' @("id" ::: Maybe Int) %~ fieldMap fromJust
where fromJust Nothing = error "non-nullable id was null!"
fromJust (Just x) = x
-- Now let's build up the tools we need to do it once and for all
-- | Remove a 'Maybe' type constructor application from the first
-- field with the given name.
type family UnMaybe name xs where
UnMaybe name ((name ::: Maybe a) ': xs) = (name ::: a) ': xs
UnMaybe name (a ': xs) = a ': UnMaybe name xs
-- | Push the 'Maybe' interpretation into the field types while
-- asserting that a particular named field is not 'Nothing'. This can
-- fail if the non-nullable field was in fact null!
nonNullable :: forall n rs t.
(t ~ FieldType n rs,
RecElem Rec
(n ::: Maybe t)
(n ::: t)
(PushMaybes rs)
(UnMaybe n (PushMaybes rs))
(RIndex (n ::: Maybe t) (PushMaybes rs)),
PushMaybeC ElField rs)
=> Rec (Maybe :. ElField) rs -> Rec ElField (UnMaybe n (PushMaybes rs))
nonNullable = (rlens' @(n ::: Maybe t) %~ fieldMap fromJust) . pushMaybe
where fromJust Nothing = error "non-nullable id was null!"
fromJust (Just x) = x
testRec3 :: Rec ElField '["id" ::: Int, "name" ::: Maybe String, "age" ::: Maybe Int]
testRec3 = nonNullable @"id" testRec
Thanks! That looks good. What are your thoughts about integrating it into the frames api? I've since needed it in 4 projects and would like to avoid the duplication of code.
Then it definitely goes in! Question: should we take a list of names to mark non-nullable?
That seems sensible, yes.
Here's how it looks now with vinyl
HEAD
{-# LANGUAGE AllowAmbiguousTypes, DataKinds, FlexibleContexts,
FlexibleInstances, MultiParamTypeClasses, PolyKinds,
ScopedTypeVariables, TypeApplications, TypeFamilies,
TypeOperators, UndecidableInstances,
UndecidableSuperClasses #-}
module PushMaybe where
import Data.Vinyl
import Data.Vinyl.Functor
import Data.Vinyl.TypeLevel
import Lens.Micro
import GHC.Types (Type, Constraint, Symbol)
-- The example record we will use for testing
testRec :: Rec (Maybe :. ElField) '["id" ::: Int, "name" ::: String, "age" ::: Int]
testRec = xrec (Just 19, Just "Steve", Nothing)
-- Manually changing the type of id
testRec2 :: Rec ElField '["id" ::: Int, "name" ::: Maybe String, "age" ::: Maybe Int]
testRec2 = rsequenceInFields testRec
& rlens' @("id" ::: Maybe Int) %~ fieldMap fromJust
where fromJust Nothing = error "non-nullable id was null!"
fromJust (Just x) = x
-- Now let's build up the tools we need to do it once and for all
-- | Remove a 'Maybe' type constructor application from the first
-- field with the given name.
type family UnMaybe name xs where
UnMaybe name ((name ::: Maybe a) ': xs) = (name ::: a) ': xs
UnMaybe name (a ': xs) = a ': UnMaybe name xs
-- | Push the 'Maybe' interpretation into the field types while
-- asserting that a particular named field is not 'Nothing'. This can
-- fail if the non-nullable field was in fact null!
nonNullable :: forall n rs t.
(t ~ FieldType n rs,
AllFields rs,
RMap rs,
RecElem Rec
(n ::: Maybe t)
(n ::: t)
(MapTyCon Maybe rs)
(UnMaybe n (MapTyCon Maybe rs))
(RIndex (n ::: Maybe t) (MapTyCon Maybe rs)))
=> Rec (Maybe :. ElField) rs
-> Rec ElField (UnMaybe n (MapTyCon Maybe rs))
nonNullable = (rlens' @(n ::: Maybe t) %~ fieldMap fromJust) . rsequenceInFields
where fromJust Nothing = error "non-nullable id was null!"
fromJust (Just x) = x
testRec3 :: Rec ElField '["id" ::: Int, "name" ::: Maybe String, "age" ::: Maybe Int]
testRec3 = nonNullable @"id" testRec
-- Support lists of non-nullable fields
-- | Strip a 'Maybe' type constructor from fields we assert to be
-- 'Just'.
class NonNullCtx fs rs => NonNullable (fs :: [Symbol]) (rs :: [(Symbol,Type)]) where
type NonNull fs rs :: [(Symbol,Type)]
type NonNullCtx fs rs :: Constraint
nonNullable' :: NonNullCtx fs rs => Rec ElField rs -> Rec ElField (NonNull fs rs)
instance NonNullable '[] rs where
type NonNull '[] rs = rs
type NonNullCtx '[] rs = ()
nonNullable' = id
-- | Helper to write a long 'RecElem' constraint.
type family RecElemAux n t rs where
RecElemAux n (Maybe t) rs = RecElem Rec (n ::: Maybe t) (n ::: t)
rs (UnMaybe n rs)
(RIndex (n ::: Maybe t) rs)
-- | This is where we assert that a non-nullable field is not 'Nothing'.
nonNullFromJust :: Maybe a -> a
nonNullFromJust = maybe (error "nonNullable was null") id
instance (FieldType n rs ~ Maybe t,
NonNullCtx (n ': ns) rs,
NonNullable ns (UnMaybe n rs))
=> NonNullable (n ': ns) rs where
type NonNull (n ': ns) rs = NonNull ns (UnMaybe n rs)
type NonNullCtx (n ': ns) rs = RecElemAux n (FieldType n rs) rs
nonNullable' = nonNullable' @ns @(UnMaybe n rs)
. (rlens' @(n ::: FieldType n rs) %~ fieldMap nonNullFromJust)
-- | Push a 'Maybe' type constructor in to all fields not specifically
-- asserted to be 'Just'.
nonNullable0 :: forall ns rs. (NonNullable ns (MapTyCon Maybe rs), AllFields rs, RMap rs)
=> Rec (Maybe :. ElField) rs -> Rec ElField (NonNull ns (MapTyCon Maybe rs))
nonNullable0 = nonNullable' @ns . rsequenceInFields
testRec4 :: Rec ElField '["id" ::: Int, "name" ::: [Char], "age" ::: Maybe Int]
testRec4 = nonNullable0 @'["id", "name"] testRec
I'm now pulled in several directions:
nonNullable
be defined?I'm really happy with getting all the machinery into vinyl to make this a relatively concise thing. Concise, that is, except for that type! Woof! The thing is, if you're using nonNullable
on specific records, you don't need to deal with that enormous type signature. You'd typically apply rsequenceInFields
to your Rec Maybe rs
, then apply rlens' @("id" ::: Maybe Int) %~ fieldMap (maybe (error "oops") id)
to however many fields you wanted to make not-Maybe
.
Most of the type signature comes from the use of rlens'
to not only look up a field, but change its type. Some more comes from rsequenceInFields
, and note that you only apply rsequenceInFields
once no matter how many fields you want to make non-null. This means that defining the composition of the two parts is only warranted if the case of a single non-nullable field is vastly more common than any other. If people often want to mark two or three fields as non-nullable, then you may as well use rlens'
directly for each of them for the sake of uniformity, rather than apply rlens'
some number of times to the result of nonNullable
. Or we could go with nonNullable0
....
nonNullable0
is probably going to be a slow-to-compile thing in practice. I don't know how its compilation time compares to manual uses of rlens'
, but it would be interesting to find out. Is this too much?
This is a better generalization. Still probably slow as everything is O(n^2).
{-# LANGUAGE AllowAmbiguousTypes, DataKinds, FlexibleContexts,
FlexibleInstances, MultiParamTypeClasses, PolyKinds,
ScopedTypeVariables, TypeApplications, TypeFamilies,
TypeOperators, UndecidableInstances,
UndecidableSuperClasses #-}
module PushMaybe where
import Control.Monad ((>=>))
import Data.Vinyl
import Data.Vinyl.Functor
import Data.Vinyl.TypeLevel
import Lens.Micro
import GHC.TypeLits (KnownSymbol)
-- The example record we will use for testing
testRec :: Rec (Maybe :. ElField) '["id" ::: Int, "name" ::: String, "age" ::: Int]
testRec = xrec (Just 19, Just "Steve", Nothing)
-- Manually changing the type of id
testRec2 :: Rec ElField '["id" ::: Int, "name" ::: Maybe String, "age" ::: Maybe Int]
testRec2 = rsequenceInFields testRec
& rlens' @("id" ::: Maybe Int) %~ fieldMap fromJust
where fromJust Nothing = error "non-nullable id was null!"
fromJust (Just x) = x
-- Now let's build up the tools we need to do it once and for all
-- | Remove a 'Maybe' type constructor application from the first
-- field with the given name.
type family UnApply f name xs where
UnApply f name ((name ::: f a) ': xs) = (name ::: a) ': xs
UnApply f name (a ': xs) = a ': UnApply f name xs
-- | Push the 'Maybe' interpretation into the field types while
-- asserting that a particular named field is not 'Nothing'. This can
-- fail if the non-nullable field was in fact null!
nonNullable :: forall n rs t.
(t ~ FieldType n rs,
AllFields rs,
RMap rs,
RecElem Rec
(n ::: Maybe t)
(n ::: t)
(MapTyCon Maybe rs)
(UnApply Maybe n (MapTyCon Maybe rs))
(RIndex (n ::: Maybe t) (MapTyCon Maybe rs)))
=> Rec (Maybe :. ElField) rs
-> Rec ElField (UnApply Maybe n (MapTyCon Maybe rs))
nonNullable = (rlens' @(n ::: Maybe t) %~ fieldMap fromJust) . rsequenceInFields
where fromJust Nothing = error "non-nullable id was null!"
fromJust (Just x) = x
testRec3 :: Rec ElField '["id" ::: Int, "name" ::: Maybe String, "age" ::: Maybe Int]
testRec3 = nonNullable @"id" testRec
-- Support lists of non-nullable fields
type family UnApplyAll f ns rs where
UnApplyAll f '[] rs = rs
UnApplyAll f (n ': ns) rs = UnApplyAll f ns (UnApply f n rs)
-- We end up using a 'Monad' constraint because each field we change
-- changes the type of the overall record. So at any given step, we
-- can produce something with a type like `g (Rec ElField rs -> Rec
-- ElField (UnApply g n rs))`, but to arrive at the final type, we
-- need to sequence these things.
-- where r :: g (ElField '(n, a))
-- r = sequenceField (rget @(n ::: FieldType n rs) x)
-- setR :: g (Rec ElField rs -> Rec ElField (UnApply g n rs))
-- setR = fmap (rput' @(n ::: FieldType n rs)) r
-- | Traverse some fields of a 'FieldRec', pulling a type constructor
-- out of the fields.
class TraverseSome g ns rs where
traverseSome :: Monad g
=> Rec ElField rs
-> g (Rec ElField (UnApplyAll g ns rs))
instance TraverseSome g '[] rs where
traverseSome = pure
instance (TraverseSome g ns (UnApply g n rs),
KnownSymbol n, FieldType n rs ~ g a,
RecElem Rec (n ::: g a) '(n, a)
rs (UnApply g n rs)
(RIndex (n ::: g a) rs))
=> TraverseSome g (n ': ns) rs where
traverseSome = traverseOf (rlens' @(n ::: FieldType n rs)) sequenceField
>=> traverseSome @g @ns @(UnApply g n rs)
nonNullable0 :: forall ns rs.
(TraverseSome Maybe ns (MapTyCon Maybe rs),
AllFields rs, RMap rs)
=> Rec (Maybe :. ElField) rs
-> Rec ElField (UnApplyAll Maybe ns (MapTyCon Maybe rs))
nonNullable0 = maybe (error "nonNullable was null") id
. traverseSome @Maybe @ns . rsequenceInFields
testRec4 :: Rec ElField '["id" ::: Int, "name" ::: [Char], "age" ::: Maybe Int]
testRec4 = nonNullable0 @'["id", "name"] testRec
It's getting to the point where traverseSome
is a useful counterpart to rsequenceInFields
, and then nonNullable
is actually a pretty simple definition.
I haven't actually had time to use these versions yet, so I'm not sure my feedback is helpful, but I only had a maximum of two fields per record that were non-nullable. Though, of course, cases where there are more are not infrequent. So if nonNullable
could be applied to the the record N times for N nonnullable fields, that's ok.
The O(n^2)
is a drag; is that incurred once per record where n
is the number of columns? (I have tens of millions of rows to de-nullify on the ingestion side . . . this happens in frequently, but still . . . . )
I haven’t worked out an easy way to speed this up; perhaps going through ARec
would do it. I don’t want this to drag out, though, so I’m inclined to move forward with what’s here, and we can address performance by running things through ARec
whenever someone has time to implement that.
As per a gitter* conversation I had with @acowley the other day, I wrote the following to remove a maybe from a non-nullable field infered in the data as Maybe but I'm not sure how the actual type of column should be changed when extracting the value
fromJust
:and then what I would like to do but am unsure about is
Could anyone advise?
*