Open codygman opened 7 years ago
In the context of the MissingData
demo,
let r = 2 &: "hey" &: Nil :: Row
What would the type be if I wanted to make a new row from those two that
appends them together resulting in Record '[CombinedText]
, meaning the
value would be 2hey
.
My use case is that I have a Record and then needed to combine some fields into a field that exists in a different Record I've defined with tabletypes.
On Dec 14, 2016 2:38 PM, "Anthony Cowley" notifications@github.com wrote:
In the context of the MissingData demo,
let r = 2 &: "hey" &: Nil :: Row
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/acowley/Frames/issues/73#issuecomment-267149593, or mute the thread https://github.com/notifications/unsubscribe-auth/AANyN_gm_HYVTLnzqkEy6uGzKWeViv9kks5rIFO8gaJpZM4LNU6N .
It all depends on what you want it to do. The most basic thing is that you would create a Record '[Text]
: there's no way to guess what you might want to label the combined column. If you do want a labeled column (using the Col
type as usual), you would need an annotation somewhere to drive the type inference. But since we're down to a single field here, I feel like maybe we're too far from your actual needs.
Focusing on the last thing you asked, I would expect that to be some lensy thing, otherRecord & colCombined .~ (T.pack (show (row^.colA)) <> row^.colB)
(assuming colCombined
is the lens for the destination column in the second record).
A more detailed example is as follows
3 data sources, 1 normalized and 2 denormalized.
compositeKey
"123"
"456"
col_1a,col_2a,col_3a
1,2,3
4,5,6
col_1a,col_2a,col_10a
1,2,3
4,5,6
Generated types (respectively):
type Normalized =
Record '["composite_key" :-> Text, ]
type ValidDenormalized =
Record '["col_1" :-> Int, "col_2" :-> Int, "col_3" :-> Int]
type InvalidDenormalized =
Record '["col_1" :-> Int, "col_2" :-> Int, "col_5555" :-> Int]
I'd like a function that will create the composite key but has the constraints that the input record must contain col_1
, col_2
, and col_3
and that the output must contain composite_key
.
According to the above I should be able to:
createCompositeKey :: (Col1 ∈ rs, Col2 ∈ rs, Col3 ∈ rs, CompositeKey ∈ rs2) =>
Record rs -> Record rs2
createCompositeKey r = compositeTxt &: Nil
where compositeTxt = (T.pack . show . view col1 $ r) <>
(T.pack . show . view col2 $ r) <>
(T.pack . show . view col3 $ r)
The point being that createCompositeKey would give a compile time error if I tried to use it with InvalidDenormalized
.
Here is my full attempt and self-contained reproduction:
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE DataKinds #-}
module Test where
import Frames
import Data.Vinyl.TypeLevel (RIndex)
import Data.Typeable (Typeable, showsTypeRep, typeRep)
import Data.Proxy
type Normalized =
Record '["composite_key" :-> Text ]
type ValidDenormalized =
Record '["col_1" :-> Int, "col_2" :-> Int, "col_3" :-> Int]
type InvalidDenormalized =
Record '["col_1" :-> Int, "col_2" :-> Int, "col_5555" :-> Int]
-- createCompositeKey :: (Col1 ∈ rs, Col2 ∈ rs, Col3 ∈ rs, CompositeKey ∈ rs2) =>
-- Record rs -> Record rs2
-- createCompositeKey r = (compositeTxt :: Text) &: Nil
-- where compositeTxt = (T.pack . show . view col1 $ r) <>
-- (T.pack . show . view col2 $ r) <>
-- (T.pack . show . view col3 $ r)
{- error:
src/Frames/Test.hs:37:24: Couldn't match type ‘rs2’ with ‘'[s0 :-> Text]’ …
‘rs2’ is a rigid type variable bound by
the type signature for
createCompositeKey :: (Col1 ∈ rs, Col2 ∈ rs, Col3 ∈ rs,
CompositeKey ∈ rs2) =>
Record rs -> Record rs2
at /home/cody/source/Frames/src/Frames/Test.hs:35:23
Expected type: Record rs2
Actual type: Record '[s0 :-> Text]
Relevant bindings include
createCompositeKey :: Record rs -> Record rs2
(bound at /home/cody/source/Frames/src/Frames/Test.hs:37:1)
In the expression: (compositeTxt :: Text) &: Nil
In an equation for ‘createCompositeKey’:
createCompositeKey r
= (compositeTxt :: Text) &: Nil
where
compositeTxt
= (T.pack . show . view col1 $ r)
<>
(T.pack . show . view col2 $ r) <> (T.pack . show . view col3 $ r)
Compilation failed.
-}
-- definitions for Type synomyns and lenses adapted from end of tutorial
type CompositeKey = "composite_key" :-> Int
type Col1 = "col_1" :-> Int
type Col2 = "col_2" :-> Int
type Col3 = "col_3" :-> Int
type Col5555 = "col_5555" :-> Int
compositeKey ::
forall f_adkB rs_adkC. (Functor f_adkB,
RElem CompositeKey rs_adkC (RIndex CompositeKey rs_adkC)) =>
(Int -> f_adkB Int) -> Record rs_adkC -> f_adkB (Record rs_adkC)
compositeKey = Frames.rlens (Proxy :: Proxy CompositeKey)
col1 ::
forall f_adkB rs_adkC. (Functor f_adkB,
RElem Col1 rs_adkC (RIndex Col1 rs_adkC)) =>
(Int -> f_adkB Int) -> Record rs_adkC -> f_adkB (Record rs_adkC)
col1 = Frames.rlens (Proxy :: Proxy Col1)
col2 ::
forall f_adkB rs_adkC. (Functor f_adkB,
RElem Col2 rs_adkC (RIndex Col2 rs_adkC)) =>
(Int -> f_adkB Int) -> Record rs_adkC -> f_adkB (Record rs_adkC)
col2 = Frames.rlens (Proxy :: Proxy Col2)
col3 ::
forall f_adkB rs_adkC. (Functor f_adkB,
RElem Col3 rs_adkC (RIndex Col3 rs_adkC)) =>
(Int -> f_adkB Int) -> Record rs_adkC -> f_adkB (Record rs_adkC)
col3 = Frames.rlens (Proxy :: Proxy Col3)
col5555 ::
forall f_adkB rs_adkC. (Functor f_adkB,
RElem Col5555 rs_adkC (RIndex Col5555 rs_adkC)) =>
(Int -> f_adkB Int) -> Record rs_adkC -> f_adkB (Record rs_adkC)
col5555 = Frames.rlens (Proxy :: Proxy Col5555)
I'll probably later put all of those composite keys in a map and then filter various producers based on it to find differences between normalized and denormalized data.
I think I see now, thanks. The issue is with this type,
createCompositeKey :: (Col1 ∈ rs, Col2 ∈ rs, Col3 ∈ rs, CompositeKey ∈ rs2) => Record rs -> Record rs2
The problem is that _ &: Nil
can only have a type Record '[t]
for some t
compatible with the value you give, and there is nothing to say that rs2 ~ '[t]
. What you need is a fully realized value of type Record rs2
. I think the best way is to take one as an argument,
createCompositeKey :: (Col1 ∈ rs, Col2 ∈ rs, Col3 ∈ rs, CompositeKey ∈ rs2) => Record rs -> Record rs2 -> Record rs2
createCompositeKey = compositeKey .~ _
(I left a hole there for the text appending business of combining the elements of rs
you want.)
But you could also gin up a new Rec Maybe rs2
or something, set that value's compositeKey
to Just
the value you want, and return that for the caller to merge with other data down the line. The point in taking it as an argument to createCompositeKey
is that the caller can decide if Rec Maybe rs2
makes sense, or if you already have a Record rs2
you want to modify, or if you want a different functor, e.g. Rec First rs2
.
What you need is a fully realized value of type Record rs2. I think the best way is to take one as an argument,
This is starting to sound like a chicken and egg problem ;)
Where will the full realized rs2 come from? How do I create a fully realized rs2? It's not something I'm getting from an external source, but something I'm creating.
On Dec 14, 2016 6:31 PM, "Anthony Cowley" notifications@github.com wrote:
I think I see now, thanks. The issue is with this type,
createCompositeKey :: (Col1 ∈ rs, Col2 ∈ rs, Col3 ∈ rs, CompositeKey ∈ rs2) => Record rs -> Record rs2
The problem is that _ &: Nil can only have a type Record '[t] for some t compatible with the value you give, and there is nothing to say that rs2 ~ '[t]. What you need is a fully realized value of type Record rs2. I think the best way is to take one as an argument,
createCompositeKey :: (Col1 ∈ rs, Col2 ∈ rs, Col3 ∈ rs, CompositeKey ∈ rs2) => Record rs -> Record rs2 -> Record rs2 createCompositeKey = compositeKey .~ _
(I left a hole there for the text appending business of combining the elements of rs you want.)
But you could also gin up a new Rec Maybe rs2 or something, set that value's compositeKey to Just the value you want, and return that for the caller to merge with other data down the line. The point in taking it as an argument to createCompositeKey is that the caller can decide if Rec Maybe rs2 makes sense, or if you already have a Record rs2 you want to modify, or if you want a different functor, e.g. Rec First rs2.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/acowley/Frames/issues/73#issuecomment-267200577, or mute the thread https://github.com/notifications/unsubscribe-auth/AANyNzHnObTyIm8rIy4izqzxT9vosdUiks5rIIpUgaJpZM4LNU6N .
You can always make a Rec Maybe rs2
as I suggested. I think mempty
would do so.
What's happening is that you seem to want to incrementally construct the record, which means you would need to be able to talk about partially constructed records, which is one way of interpreting a Rec Maybe
. The alternative is to compute all your fields individually and build your record using &:
to either create your desired record in one shot or to incrementally build up to your rs2
set of fields one field at a time, bearing in mind that those intermediate records will not have a set of fields equal to rs2
.
Thanks, I'll try out the recMaybe approach as you suggested. I also thank you for deepening my understanding.
On Dec 14, 2016 8:23 PM, "Anthony Cowley" notifications@github.com wrote:
You can always make a Rec Maybe rs2 as I suggested. I think mempty would do so.
What's happening is that you seem to want to incrementally construct the record, which means you would need to be able to talk about partially constructed records, which is one way of interpreting a Rec Maybe. The alternative is to compute all your fields individually and build your record using &: to either create your desired record in one shot or to incrementally build up to your rs2 set of fields one field at a time, bearing in mind that those intermediate records will not have a set of fields equal to rs2.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/acowley/Frames/issues/73#issuecomment-267218956, or mute the thread https://github.com/notifications/unsubscribe-auth/AANyN0OmATu-gOKn5VaR2-cMBhvh7siLks5rIKSTgaJpZM4LNU6N .
Just to clarify, createCompositeKey' :: forall rs rs2. (Col1 ∈ rs, Col2 ∈ rs, Col3 ∈ rs, CompositeKey ∈ rs2) => Record rs -> Rec Maybe rs2
should be possible then?
If so, I'm confused at why this still has an issue:
mkMaybeCompositeKey :: (CompositeKey ∈ rs) => Rec Maybe rs
mkMaybeCompositeKey = Just "compositekey" :& Nil
Then I read your comment about mempty working and tried this:
mkMaybeCompositeKey :: (CompositeKey ∈ rs) => Rec Maybe rs
mkMaybeCompositeKey = mempty
I'm clearly not applying it right. Before I submit this I'll re-read what you wrote above to make sure I'm not wasting your time. After re-reading I tried a lens based approach:
createCompositeKey :: (Col1 ∈ rs, Col2 ∈ rs, Col3 ∈ rs, CompositeKey ∈ rs2) => Record rs -> Rec Maybe rs2
createCompositeKey r = r & compositeKey ?~ (Just compositeTxt)
where compositeTxt = (T.pack . show . view col1 $ r) <>
(T.pack . show . view col2 $ r) <>
(T.pack . show . view col3 $ r)
-- src/Frames/Test.hs:70:28: Couldn't match type ‘Data.Vinyl.Functor.Identity’ with ‘Maybe’ …
-- Expected type: ASetter
-- (Record rs) (Rec Maybe rs2) Text (Maybe (Maybe Text))
-- Actual type: (Text -> Identity Text)
-- -> Record rs2 -> Identity (Record rs2)
-- In the first argument of ‘(?~)’, namely ‘compositeKey’
-- In the second argument of ‘(&)’, namely
-- ‘compositeKey ?~ (Just compositeTxt)’
-- Compilation failed.
I'm starting to think the other alternative you mentioned (merging these columns in my producer) would be worlds easier since it's so difficult(impossible?) to construct something that satisfies CompositeKey ∈ rs2
. I still don't quite understand why that doesn't work, but I was very much hoping CompositeKey ∈ rs2)
meant "something named composite_key' that is of type Text". My feeling now is it does mean that, except you can't create a Record that happens to have a column named composite_key and happens to have a value of type Text and have it satisfy CompositeKey ∈ rs2)
.
Well, off to bed.
Okay, sorry for the confusion, I'm at a computer now. What you want is,
> rpure Nothing & compositeKey' ?~ Col "my voice is my passport" :: Rec Maybe (RecordColumns Row2)
{Just compositeKey :-> "my voice is my passport", Nothing}
(Row2
there is just a dummy table I set up with two columns, one of which is compositeKey
.)
It's rpure Nothing
to build the Rec Maybe rs2
, and then the primed lens is needed because the unprimed one is fixed to the functor being Identity
. A knock on from that is the need to explicitly use the Col
newtype constructor because the unprimed lens doesn't bake that in, either.
The problem with your last attempt is that you have an argument, r
, of type Record rs
, but you then use it to create the return value which is supposed to have type, Rec Maybe rs2
. So replace that first line of your definition with rpure Nothing & compositeKey ?~ Col compositeTxt
and you should be cooking!
@acowley I think I'm closer, but that didn't quite work. This is rpure
from import Data.Vinyl (RecApplicative(rpure))
right?
createCompositeKey :: (Col1 ∈ rs, Col2 ∈ rs, Col3 ∈ rs, CompositeKey ∈ rs2) => Record rs -> Rec Maybe rs2
createCompositeKey r = rpure Nothing & compositeKey ?~ Col compositeTxt
where compositeTxt = (T.pack . show . view col1 $ r) <>
(T.pack . show . view col2 $ r) <>
(T.pack . show . view col3 $ r)
-- error
-- src/Frames/Test.hs:79:30: Couldn't match type ‘Data.Vinyl.Functor.Identity’ with ‘Maybe’ …
-- Expected type: Data.Vinyl.Functor.Identity x
-- Actual type: Maybe x
-- In the first argument of ‘rpure’, namely ‘Nothing’
-- In the first argument of ‘(&)’, namely ‘rpure Nothing’
-- src/Frames/Test.hs:79:40: Couldn't match type ‘Data.Vinyl.Functor.Identity’ with ‘Maybe’ …
-- Expected type: ASetter
-- (Rec Data.Vinyl.Functor.Identity rs2)
-- (Rec Maybe rs2)
-- Text
-- (Maybe (s0 :-> Text))
-- Actual type: (Text -> Identity Text)
-- -> Record rs2 -> Identity (Record rs2)
-- In the first argument of ‘(?~)’, namely ‘compositeKey’
-- In the second argument of ‘(&)’, namely
-- ‘compositeKey ?~ Col compositeTxt’
-- Compilation failed.
That is the first issue I referred to when explaining why the primed lens, compositeKey'
, is needed. You can see in the error message that the "Actual type" talks about Record
rather than Rec
.
We generate two lenses for convenience (and, unfortunately, confusion): compositeKey
has types nailed down to minimize syntactic clutter for the common case of working with the Identity
functor and Col
types. The template haskell also generates compositeKey'
that is the raw vinyl
lens into the record field. In the latter case, the field type could be anything (so we need to use the Col
constructor) and the functor could be anything, which is what lets us use Just
.
Ah, I was confused because compositeKey' want generated since I was working from the above and didn't define it.
I do see what you mean now.
On Dec 15, 2016 10:15 AM, "Anthony Cowley" notifications@github.com wrote:
That is the first issue I referred to when explaining why the primed lens, compositeKey', is needed. You can see in the error message that the "Actual type" talks about Record rather than Rec.
We generate two lenses for convenience (and, unfortunately, confusion): compositeKey has types nailed down to minimize syntactic clutter for the common case of working with the Identity functor and Col types. The template haskell also generates compositeKey' that is the raw vinyl lens into the record field. In the latter case, the field type could be anything (so we need to use the Col constructor) and the functor could be anything, which is what lets us use Just.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/acowley/Frames/issues/73#issuecomment-267368933, or mute the thread https://github.com/notifications/unsubscribe-auth/AANyNz57ZkMhkc58HM4nKu3vWZtAKZRoks5rIWeSgaJpZM4LNU6N .
Now I have a CompositeKey ∈ rs2 => Rec Maybe rs2
and tried:
f :: CompositeKey ∈ rs2 => Rec Maybe rs2 -> Record rs2
f = fromMaybe . recMaybe
However I ran into the same problem of not being able to satisfy the CompositeKey ∈ cs0
constraint. Do I need to sequence these and discard the nothing values somehow first? Perhaps I need to use the First functor for some reason?
I'll put my full code example here later, but to add detail for now I'm trying to use createCompositeKey in a pipe like denormalizedSource >-> P.take 5 >-> createCompositeKey >-> P.print
.
Is cs0
the same as rs2
? You need to have something concrete for that type variable at some point in your program. In typical Frames
usage, the type synonym generated for the row type is a concrete type that shows up a lot and is used to pin down the types outside of library code that is polymorphic with respect to the full row type. In my example above, you see that I had a Row2
type. In that case, I created a CSV file with the columns I wanted and let Frames
generate things because I didn't remember how to do it manually. If this destination row type of yours is entirely for output purposes, then you will need to define it in code.
If createCompositeKey
is specifically intended to be polymorphic in the row type, then your debugging code will just need a little nudge to use your destination type. E.g. replacing P.print
with P.print . id @Row2
(using TypeApplications
).
Edit: I checked the docs, and it's declareColumn
that would be used to define things manually.
cs0 and rs2 are not the same and I'm not sure why ghc is looking for cs0 rather than the rs2 constraint I provided.
I created a CSV file with the columns I wanted and let Frames generate things because I didn't remember how to do it manually
I don't have time to post full code or go into more detail now, but my intuition or pessimism tells me I need to know how to do the thing you forgot how to do :)
On Dec 15, 2016 12:45 PM, "Anthony Cowley" notifications@github.com wrote:
Is cs0 the same as rs2? You need to have something concrete for that type variable at some point in your program. In typical Frames usage, the type synonym generated for the row type is a concrete type that shows up a lot and is used to pin down the types outside of library code that is polymorphic with respect to the full row type. In my example above, you see that I had a Row2 type. In that case, I created a CSV file with the columns I wanted and let Frames generate things because I didn't remember how to do it manually. If this destination row type of yours is entirely for output purposes, then you will need to define it in code.
If createCompositeKey is specifically intended to be polymorphic in the row type, then your debugging code will just need a little nudge to use your destination type. E.g. replacing P.print with P.print . id @Row2 (using TypeApplications).
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/acowley/Frames/issues/73#issuecomment-267407626, or mute the thread https://github.com/notifications/unsubscribe-auth/AANyNzGYpop1UHjAXZfPE3iTBfiyN5FKks5rIYrigaJpZM4LNU6N .
Doing it when you don't have a CSV to generate Haskell from goes like this,
declareColumn "compositeKey" ''Text
declareColumn "foobar" ''Bool
type Row2 = Record '[CompositeKey, Foobar]
test :: Rec Maybe (RecordColumns Row2)
test = rpure Nothing & compositeKey' ?~ Col "my voice is my passport"
So @acowley ... I finally accomplished what I was trying to do:
composite_key,tag
"1234","one"
"5678","two"
"9101","three"
"1121","four"
key_a,key_b,key_c,key_d,tag
1,2,3,4,"one"
9,1,0,1,"three"
1,1,2,1,"four"
The goal being to first create a composite key column in denormalized.csv and then find out which row is missing in denormalized.csv.
Here is the full working code:
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TemplateHaskell #-}
module Test where
import Frames hiding ((:&))
import Frames.CSV (tableTypesOverride, RowGen(..), rowGen, readTableOpt)
import Frames.InCore (RecVec)
import Data.Vinyl (Rec(RNil), RecApplicative(rpure))
import Data.Vinyl.TypeLevel (RIndex)
import Control.Lens (view, (&), (?~))
import Pipes (Producer, (>->), runEffect)
import Pipes.Internal (Proxy)
import Data.Monoid ((<>))
import Data.Maybe (fromMaybe, isNothing)
import qualified Control.Foldl as L
import qualified Pipes.Prelude as P
import qualified Data.Text as T
import qualified Data.Text.Lazy as LT
import qualified Data.Map as M
import qualified Data.Text.Format as T
tableTypesOverride rowGen {rowTypeName ="Normalized"} "normalized.csv" [("composite_key", ''Text)]
tableTypesOverride rowGen {rowTypeName ="Denormalized"} "denormalized.csv" []
mkCompositeKey :: (RecApplicative rs2, KeyA ∈ rs, KeyB ∈ rs, KeyC ∈ rs, KeyD ∈ rs, CompositeKey ∈ rs2) =>
Record rs -> Record rs2
mkCompositeKey denormRow = fromMaybe (error "failed to make composite key") . recMaybe $ newRow
where newRow = rpure Nothing & compositeKey' ?~ Col compositeKeyTxt
compositeKeyTxt = view' keyA <> view' keyB <> view' keyC <> view' keyD
view' l = LT.toStrict $ T.format "{}" (T.Only (view l denormRow))
intToTxt = T.pack . show :: Int -> Text -- slow!
normalized :: Producer Normalized IO ()
normalized = readTableOpt normalizedParser "normalized.csv"
-- λ> runEffect $ normalized >-> P.take 5 >-> P.print
-- {composite_key :-> "1234", tag :-> "one"}
-- {composite_key :-> "5678", tag :-> "two"}
-- {composite_key :-> "9101", tag :-> "three"}
denormalized :: Producer Denormalized IO ()
denormalized = readTableOpt denormalizedParser "denormalized.csv"
-- λ> runEffect $ denormalized >-> P.take 5 >-> P.print
-- {key_a :-> 1, key_b :-> 2, key_c :-> 3, key_d :-> 4, tag :-> "one"}
-- {key_a :-> 5, key_b :-> 6, key_c :-> 7, key_d :-> 8, tag :-> "two"}
-- {key_a :-> 9, key_b :-> 1, key_c :-> 0, key_d :-> 1, tag :-> "three"}
-- denormalized' is actually the "normalized" denormalized in that it contains the same compositeKey. I was lazy and didn't care about keeping the tag.
denormalized' :: Producer (Record '[CompositeKey]) IO ()
denormalized' = denormalized >-> P.map mkCompositeKey
-- λ> runEffect $ denormalized' >-> P.take 5 >-> P.print
-- {composite_key :-> "1234"}
-- {composite_key :-> "5678"}
-- {composite_key :-> "9101"}
addCompositeKeyToMapFold :: (Num a, CompositeKey ∈ rs) =>
L.Fold (Record rs) (M.Map Text a)
addCompositeKeyToMapFold = L.Fold (\m r -> addCompositeKeyToMap m r) (M.empty) id
addCompositeKeyToMap
:: (Num a, CompositeKey ∈ rs) =>
M.Map Text a -> Record rs -> M.Map Text a
addCompositeKeyToMap m r = M.insert (view compositeKey r) 0 m
buildCompositeKeyMap :: (Foldable f, CompositeKey ∈ rs) =>
f (Record rs) -> M.Map Text Integer
buildCompositeKeyMap = L.fold addCompositeKeyToMapFold
findMissingRows :: forall
(m :: * -> *)
(rs :: [*])
a'
a
(m1 :: * -> *)
r
(rs1 :: [*]).
( Monad m1,
CompositeKey ∈ rs,
CompositeKey ∈ rs1,
L.PrimMonad m,
Frames.InCore.RecVec rs
) =>
Pipes.Internal.Proxy a' a () (Record rs1) m1 r
-> Producer (Record rs) m ()
-> m (Pipes.Internal.Proxy a' a () (Record rs1) m1 r)
findMissingRows referenceProducer checkProducer = do
-- build the index of rows in the producer to check
compositeKeyMap <- buildCompositeKeyMap <$> inCoreAoS checkProducer
-- we could have built compositeKeyMap in a single line if we were golfing
-- L.fold (L.Fold (\m r -> M.insert (view compositeKey r) 0 m) (M.empty) id) <$> inCoreAoS checkProducer
-- keep only rows we didn't have in the checkProducer index produced
return $ referenceProducer >-> P.filter (\r -> isNothing $ M.lookup (view compositeKey r) compositeKeyMap)
printMissingRows :: IO ()
printMissingRows = do
putStrLn "rows normalized contains deonrmalized does not"
findMissingRows normalized denormalized' >>= \p -> runEffect $ p >-> P.print
I have a few questions:
compositeKey r
and the compositeKeyMap
's keys. Is there a way to tell ghc that the two pipes producers columns which are compared must be equal, without having to go into specifics?
-Would you have done this differently?First and foremost: congratulations! 🎉 😀
That type signature is certainly an eyesore, it looks like something GHCi would say. It's late here, so this might not be quite right, but I'll give it a quick shot.
Oh, I also don't think you benefit from inCoreAoS
here as that is primarily for situations where you need to perform multiple passes over the data. This may require a change to the fold things that build the Map
, but Pipes.Prelude.fold
might do the job.
In that case, I'd expect it to go something like this,
findMissingRows :: (Monad m, CompositeKey ∈ rs, CompositeKey ∈ rs1)
=> Producer (Record rs) m ()
-> m (Pipe (Record rs1) (Record rs1) m ())
findMissingRows checkProducer = do
-- build the index of rows in the producer to check
compositeKeyMap <- buildCompositeKeyMap checkProducer
-- keep only rows we didn't have in the checkProducer index produced
return $ P.filter (\r -> M.notMember (view compositeKey r) compositeKeyMap)
Then you do the plumbing elsewhere,
printMissingRows :: IO ()
printMissingRows = do
putStrLn "rows normalized contains deonrmalized does not"
findMissing <- findMissingRows denormalized
runEffect $ normalized >-> findMissing >-> P.print
In general you usually want to make an effort to separate the gluing together of pipes/conduits/machines from the definitions of the pieces. This has the nice effect of making it relatively straightforward to read off the data flow of the program by looking at the top-level definition.
Thanks! This one was (and is) a little tough ;)
Looks like I wasn't quite done...
If I change the producers to use the defaulting ones:
normalizedDefaulted :: Producer Normalized IO ()
normalizedDefaulted = readTableMaybe "normalized.csv" >-> P.map (fromJust . holeFiller)
where holeFiller :: Rec Maybe (RecordColumns Normalized) -> Maybe Normalized
holeFiller = recMaybe
. rmap getFirst
. rapply (rmap (Lift . flip (<>)) def)
. rmap First
fromJust = fromMaybe (error "normalizedDefaulted failure")
denormalizedDefaulted :: Producer Denormalized IO ()
denormalizedDefaulted = readTableMaybe "denormalized.csv" >-> P.map (fromJust . holeFiller)
where holeFiller :: Rec Maybe (RecordColumns Denormalized) -> Maybe Denormalized
holeFiller = recMaybe
. rmap getFirst
. rapply (rmap (Lift . flip (<>)) def)
. rmap First
fromJust = fromMaybe (error "denormalizedDefaulted failure")
-- denormalized' is actually the "normalized" denormalized in that it contains the same compositeKey. I was lazy and didn't care about keeping the tag.
denormalizedDefaulted' :: Producer (Record '[CompositeKey]) IO ()
denormalizedDefaulted' = denormalizedDefaulted >-> P.map mkCompositeKey
I get the following type constraint error:
src/Frames/Test.hs:145:3: error: …
• No instance for (RElem
("composite_key" :-> Text)
'[]
(RIndex ("composite_key" :-> Text) '[]))
arising from a use of ‘findMissingRows’
• In the first argument of ‘(>>=)’, namely
‘findMissingRows normalizedDefaulted denormalizedDefaulted’
In a stmt of a 'do' block:
findMissingRows normalizedDefaulted denormalizedDefaulted
>>= \ p -> runEffect $ p >-> P.print
In the expression:
do { putStrLn "rows normalized contains deonrmalized does not";
findMissingRows normalizedDefaulted denormalizedDefaulted
>>= \ p -> runEffect $ p >-> P.print }
Compilation failed.
I'll try taking advantage of pipelines more and see if that fixes the inference problem as well.
It seems that changing the first line of denormalizedDefaulted
as well as the type signature fixed my issues:
denormalizedDefaulted :: Producer (Record '[CompositeKey]) IO ()
denormalizedDefaulted = readTableMaybe "denormalized.csv" >-> P.map (fromJust . holeFiller) >-> P.map mkCompositeKey
I loosely understand why that worked I think.
Yes! That's the issue we talked about earlier. Feeding something with polymorphic output into P.print
is no good because P.print
won't fix the types involved. Luckily, all that is needed is to provide a type that satisfies CompositeKey ∈ rs
which '[CompositeKey]
certainly does.
Funnily enough, my actual problem was that when I translated this example over to my use case I was mixing up the normalize and denormalized sources.
My error came from calling buildCompositeKey on Normalized which doesn't contain Key1, 2, or, 3.
I was so sure that this was a type inference error I overlooked the possibility that frames was doing what it was meant to ;)
A big part of that is the error messages being off-putting. Once you start breaking it down, they're not actually that bad. The problem is that, for me at least, there is so much text that I don't want to read it. I think Idris-mode does a good job in this area with collapsible error messages.
Now I'm trying to update the above to append the CompositeKey column rather than create a record of only CompositeKey. I've been using this issue for reference, but I'm stuck with this:
f :: ( RecApplicative rs2
, KeyA ∈ rs
, KeyB ∈ rs
, KeyC ∈ rs
, KeyD ∈ rs
, CompositeKey ∈ rs2
) => Record rs -> Record (CompositeKey ': RecordColumns Denormalized)
f inRecord = frameConsA compositeKey inRecord
where compositeKeyTxt :: Text
compositeKeyTxt = view compositeKey (mkCompositeKey inRecord)
-- type error:
-- src/Main.hs:129:14: error: …
-- • Couldn't match type ‘(Text -> f0 Text)
-- -> Record rs1 -> f0 (Record rs1)’
-- with ‘Text’
-- Expected type: Record (CompositeKey : RecordColumns Denormalized)
-- Actual type: Rec
-- Identity
-- ("composite_key"
-- :-> ((Text -> f0 Text) -> Record rs1 -> f0 (Record rs1))
-- : rs)
-- • In the expression: frameConsA compositeKey inRecord
-- In an equation for ‘f’:
-- f inRecord
-- = frameConsA compositeKey inRecord
-- where
-- compositeKeyTxt :: Text
-- compositeKeyTxt = view compositeKey (mkCompositeKey inRecord)
-- Compilation failed.
Shouldn't that be compositeKeyTxt
in the frameConsA
application?
@acowley Oh dear, I think you are right :D
I still have an error despite fixing that particular issue:
f :: ( RecApplicative rs2
, KeyA ∈ rs
, KeyB ∈ rs
, KeyC ∈ rs
, KeyD ∈ rs
, CompositeKey ∈ rs2
) => Record rs -> Record (CompositeKey ': RecordColumns Denormalized)
f inRecord = frameConsA ("compositeKeysText" :: Text) inRecord
-- type error:
-- src/Main.hs:129:14: error: …
-- • Couldn't match type ‘rs’
-- with ‘'["key_a" :-> Int, "key_b" :-> Int, "key_c" :-> Int,
-- "key_d" :-> Int, "tag" :-> Text, "bdouble" :-> Double,
-- "bbool" :-> Bool]’
-- ‘rs’ is a rigid type variable bound by
-- the type signature for:
-- f :: forall (rs2 :: [*]) (rs :: [*]).
-- (RecApplicative rs2, KeyA ∈ rs, KeyB ∈ rs, KeyC ∈ rs, KeyD ∈ rs,
-- CompositeKey ∈ rs2) =>
-- Record rs -> Record (CompositeKey : RecordColumns Denormalized)
-- at /home/cody/sources/frames-differences-current/src/Main.hs:122:6
-- Expected type: Record (CompositeKey : RecordColumns Denormalized)
-- Actual type: Rec Identity ("composite_key" :-> Text : rs)
-- • In the expression:
-- frameConsA ("compositeKeysText" :: Text) inRecord
-- In an equation for ‘f’:
-- f inRecord = frameConsA ("compositeKeysText" :: Text) inRecord
-- • Relevant bindings include
-- inRecord :: Record rs
-- (bound at /home/cody/sources/frames-differences-current/src/Main.hs:129:3)
-- f :: Record rs
-- -> Record (CompositeKey : RecordColumns Denormalized)
-- (bound at /home/cody/sources/frames-differences-current/src/Main.hs:129:1)
-- Compilation failed.
Isn't the below equal?
(KeyA ∈ rs , KeyB ∈ rs , KeyC ∈ rs , KeyD ∈ rs) => rs == Record (CompositeKey : RecordColumns Denormalized)
I relaxed the constraint a little to just require containing CompositeKey
and got the following:
f :: ( RecApplicative rs2
, KeyA ∈ rs
, KeyB ∈ rs
, KeyC ∈ rs
, KeyD ∈ rs
, CompositeKey ∈ rs2
) => Record rs -> Record rs2
f inRecord = frameConsA ("compositeKeysText" :: Text) inRecord
-- src/Main.hs:129:14: error: …
-- • Couldn't match type ‘rs2’ with ‘s0 :-> Text : rs’
-- ‘rs2’ is a rigid type variable bound by
-- the type signature for:
-- f :: forall (rs2 :: [*]) (rs :: [*]).
-- (RecApplicative rs2, KeyA ∈ rs, KeyB ∈ rs, KeyC ∈ rs, KeyD ∈ rs,
-- CompositeKey ∈ rs2) =>
-- Record rs -> Record rs2
-- at /home/cody/sources/frames-differences-current/src/Main.hs:122:6
-- Expected type: Record rs2
-- Actual type: Rec Identity (s0 :-> Text : rs)
-- • In the expression:
-- frameConsA ("compositeKeysText" :: Text) inRecord
-- In an equation for ‘f’:
-- f inRecord = frameConsA ("compositeKeysText" :: Text) inRecord
-- • Relevant bindings include
-- inRecord :: Record rs
-- (bound at /home/cody/sources/frames-differences-current/src/Main.hs:129:3)
-- f :: Record rs -> Record rs2
-- (bound at /home/cody/sources/frames-differences-current/src/Main.hs:129:1)
-- Compilation failed.
In this example I'm also thinking that rs2
and (s0 :-> Text : rs)
but according to the error, it looks like this is untrue.
You can decode the problem from the error messages.
f :: ( RecApplicative rs2
, KeyA ∈ rs
, KeyB ∈ rs
, KeyC ∈ rs
, KeyD ∈ rs
, CompositeKey ∈ rs2
) => Record rs -> Record (CompositeKey ': RecordColumns Denormalized)
This type says that you will take any list of type rs
that satisfies that constraints and return a specific list CompositeKey ': RecordColumns Denormalized
, but you are building the result by cons'ing the CompositeKey
column onto rs
, which means that you are assuming rs ~ RecordColumns Denormalized
which does not follow from the type.
You can either return Record (CompositeKey ': rs)
or change the argument to Denormalized
.
In your second attempt you are assuming rs2 ~ (CompositeKey ': rs)
, which again does not follow from the type signature.
Note that you are /adding/ a column to something, so it is unlikely that the CompositeKey ∈ _
constraint is ever needed.
In earlier conversations we talked about updating a record that had a CompositeKey
column. In that case, you would need that constraint because you are replacing an existing column. In this more recent effort you are using frameConsA
to add a column to a record, so there is no CompositeKey
constraint that the caller must satisfy for the function to do its work.
Wow finally, Full working version. I got some ideas on how to make it more generic from Applying Type-Level and Generic Programming in Haskell by Andres Löh as well as a better understanding as to how vinyl and frames are implemented.
Thanks for your patience and guidance. Hopefully in return I can clean up this example and add it to the demos directory and perhaps add some other tutorials and/or documentation :)
So now I'm trying to make a generic findMissingRowsOn
. I almost finished but then ran the issue below. I think I need to add a MonadReader _ Identity
constraint on rec2
, but I'm not sure what _
should be in that case.
Is there a way to find out the default instance of MonadReader that view uses from ghci when I'm using it in a more decidable case?
findMissingRowsOn :: ( Monad monad
, Ord key
) =>
Getting key (Rec Identity rec1) key -- lens
-> Producer (Record rec1) monad () -- checkProducer
-> monad (Pipe (Record rec2) (Record rec2) monad ())
findMissingRowsOn lens checkProducer = do
keyMap <- P.fold (\m r -> M.insert (view lens r) 0 m) M.empty id checkProducer
pure $ P.filter (\(r :: Record rec2) -> M.notMember (view lens r) keyMap)
-- type error:
-- src/Main.hs:148:56: error: …
-- • Could not deduce (MonadReader
-- (Rec Identity rec1) ((->) (Record rec2)))
-- arising from a use of ‘view’
-- from the context: (Monad monad, Ord key)
-- bound by the type signature for:
-- findMissingRowsOn :: (Monad monad, Ord key) =>
-- Getting key (Rec Identity rec1) key
-- -> Producer (Record rec1) monad ()
-- -> monad (Pipe (Record rec2) (Record rec2) monad ())
-- at /home/cody/source/frames-differences/src/Main.hs:(140,1)-(145,70)
-- • In the first argument of ‘M.notMember’, namely ‘(view lens r)’
-- In the expression: M.notMember (view lens r) keyMap
-- In the first argument of ‘P.filter’, namely
-- ‘(\ (r :: Record rec2) -> M.notMember (view lens r) keyMap)’
-- Compilation failed.
So the above has to do with the lens being used for rec1
and then attempting to use it on a rec2
when the lens has already specialized on a rec1
. I'm not sure how to let the type system know that the lens can work on either rec2
or rec1
.
I verified this by making a function that takes two lenses and passing in the exact same lens and seeing it typecheck.
You can look at the lenses the Frames
TH generates to see what the types look like without any lens
type synonyms. Passing something with that sort of type, but perhaps using a synonym to make it cleaner, should work.
How can I make a buildRow function of type
Row
? Perhaps withColFun
? Perhaps something totally different? Something is telling me I could do this by distributing Identity across the record with ColFun.