acowley / Frames

Data frames for tabular data.
Other
297 stars 41 forks source link

CoRec as a standalone library? #42

Closed erratic-pattern closed 6 years ago

erratic-pattern commented 9 years ago

Any plans to export this CoRec structure to its own library? or perhaps as part of vinyl? I'm interested in trying to use this for the upcoming vinyl rework of webdriver, but I'd like to only depend on a library specifically tailored to CoRec.

acowley commented 9 years ago

Yeah, let's do it! Any thoughts on name changes, organization, or other breaking changes? cc @jonsterling.

jonsterling commented 9 years ago

Lyniv? :wink:

erratic-pattern commented 9 years ago

I think it would make sense to include it in the vinyl package, but I do like that name. :D

jonsterling commented 9 years ago

@kallisti-dev Oh, I had understood from your initial comment that you wanted it to be separate from Vinyl. I'm fine with it being in the Vinyl package if @acowley thinks thats a good idea.

erratic-pattern commented 9 years ago

Ah yes that was a mistake of wording on my part.

acowley commented 9 years ago

I think it should probably go in Vinyl, too. It's small and specifically designed to be symmetric. I'll do some thinking on possible minor renamings tomorrow, and give anyone else a chance to weigh in on it. Paging also @meiersi and @noinia since both have contributed to that module.

meiersi commented 9 years ago

I'd appreciate such a move. I'll likely need some CoRec's in the future.

Anthony Cowley notifications@github.com schrieb am Di., 20. Okt. 2015 6:02 AM:

I think it should probably go in Vinyl, too. It's small and specifically designed to be symmetric. I'll do some thinking on possible minor renamings tomorrow, and give anyone else a chance to weigh in on it. Paging also @meiersi https://github.com/meiersi and @noinia https://github.com/noinia since both have contributed to that module.

— Reply to this email directly or view it on GitHub https://github.com/acowley/Frames/issues/42#issuecomment-149425309.

noinia commented 9 years ago

I'm fine with moving it to a vinyl as well.

andrewthad commented 8 years ago

If a CoRec type is added to vinyl, it seems like it would make the most sense to define it in a way that actually makes it dual to Rec (not sure if I'm using the word "dual" correctly). Like this:

-- definition of Rec, just for reference
data Rec :: (u -> *) -> [u] -> * where
  RNil :: Rec f '[]
  (:&) :: !(f r) -> !(Rec f rs) -> Rec f (r ': rs)

data CoRec :: (u -> *) -> [u] -> * where
  CRNil :: Void -> CoRec f '[] -- uninhabited, possibly not needed
  CRCons :: Either (f r) (CoRec f rs) -> CoRec f (r ': rs)

Then you could easily define things like:

replace :: CoRec f rs -> Rec f rs -> Rec f rs
replace (CRCons (Left new)) (old :& rs) = new :& rs
replace (CRCons (Right c)) (old :& rs) = r :& replace c rs

Though I assume that there were reasons that CoRec was defined in Frames the way that it was, and maybe this approach gains the neatness of a duality at the cost of something else.

acowley commented 8 years ago

I see where you're coming from with the Either argument to the constructor as a replacement for the curried pair to Rec's cons, and we end up creating evidence like that with the RElem constraint on the existing constructor.

Suppose I want to create a Char variant of the type, CoRec f [Int, Char, Double] for some Applicative f. With the existing approach, we can say, Col (pure 'a'). With the explicit Either approach, we'd write CRCons (Right (CRCons (Left (pure 'a')))) (I think). We'd probably end up using a class to actually create the value to make it simpler.

acowley commented 8 years ago

Any new thoughts on this, @andrewthad? It would be nice to pull this into the other changes going on in Vinyl. I also have a record intersection branch somewhere that could be picked up again, as I suspect you might have some insight into that.

andrewthad commented 8 years ago

As far as CoRec is concerned, I'm in favor of the approach I gave (without CRNil) plus, as you suggested, the addition of a typeclass that makes value construction easier:

data CoRec :: (u -> *) -> [u] -> * where
  CRCons :: Either (f r) (CoRec f rs) -> CoRec f (r ': rs)

class ToCoRec r rs where
  toCoRec :: f r -> CoRec f rs
instance {-# OVERLAPPABLE #-} ToCoRec r rs => ToCoRec r (a ': rs) where
  toCoRec r = CRCons (Right (toCoRec r))
instance {-# OVERLAPPING #-} ToCoRec r (r ': rs) where
  toCoRec r = CRCons (Left r)

I haven't actually testing this yet, but it should provide a way to easily create CoRecs:

foo :: Applicative f => CoRec f '[Int,Char,Bool]
foo = toCoRec (pure True)

I'll try compiling in a little while to see if that actually works.

Also, there is probably a better name for CRCons (since I don't usually see cons mentioned when dealing with sums). I'd like to know what the more correct thing to call this is. @jonsterling What name do you think is appropriate?

And yes, I would definitely have a look at the intersection branch. I'll see if I can find it.

andrewthad commented 8 years ago

I just tested it, and the definition of CoRec and the typeclass work as expected.

andrewthad commented 8 years ago

Actually, I thought about it some more, and I like this equivalent definition better:

data CoRec :: (u -> *) -> [u] -> * where
  CoRecHere :: !(f r) -> CoRec f (r ': rs)
  CoRecThere :: !(CoRec f rs) -> CoRec f (r ': rs)

Again, definitely interested in knowing if anyone has better ideas for how to name the data constructors.

acowley commented 8 years ago

Is this really an improvement? The existing definition factors the generic Elem plumbing into a constraint, while here it is inlined into the CoRec definition. What is the argument for denormalizing the design in this way?

andrewthad commented 8 years ago

It's really interesting how different how different perspectives can be. To me, the CoRec in Frames seems denormalized, precisely because it uses a typeclass constraint. But you see it the other way around.

This is the most empirical reasoning I can come up with: If a version of CoRec goes into vinyl, it should be dual to Rec's definition. The two main reasons are:

  1. The combination of CoRec and a ToCoRec typeclass would mean that you wouldn't lose any expressive power. Any functions that can be written with Frames's CoRec can be written with this alternative. The only difference is that some might have an additional constraint in the type signature. (I'm may have missed something here, so let me know if you think of a function that's unrepresentable with the alternative formulation).
  2. There are functions that can be written with this alternative that cannot be written with Frames's version. The replace function I gave as an example in an earlier comment is an example of this. (If you can think of a way to write it, let me know).
acowley commented 8 years ago

Sure it's possible, it reuses the shared RElem machinery.

replace :: CoRec f rs -> Rec f rs -> Rec f rs
replace (Col x) = rput x
noinia commented 8 years ago

The only difference is that some might have an additional constraint in the type signature. (I'm may have missed something here, so let me know if you think of a function that's unrepresentable with the > alternative formulation).

I haven't really thought about what "the right way" is to define a CorRec type, and how this effects the required constraints on functions. However, I think that "just requiring an additional constraint on some functions" is kind of a big issue. One of the main annoyances I've had with some "advanced" type-level trickery stuff in haskell is that you fairly frequently have to drag around constraints that are "obviously and/or always true". This needlessly complicates stuff. Hence, I think we should be careful introducing (additional?) constraints.

andrewthad commented 8 years ago

@noinia Let me clarify that a little bit. Moving a constraint from outside a GADT constructor to inside it doesn't actually change the requirement of the constraint. It just makes it more convenient to write the type signature (sometimes). When you pass around that GADT, you are implicitly passing around that constraint. With this approach, there would be no additional constraints required (RElem becomes ToCoRec though), but sometimes they would just have to be written out explicitly in a type signature because there is no GADT that hides them.

@acowley Good job noticing that the example I picked was actually a bad example because it's something that RElem specifically lets you recover. Let me try a few others that rely on induction that RElem doesn't capture:

Currently in Frames, CoRec has no Show instance or Eq instance. We can write definitions that mirror those for Rec if CoRec is defined this way:

data CoRec :: (u -> *) -> [u] -> * where
  CoRecHere :: !(f r) -> CoRec f (r ': rs)
  CoRecThere :: !(CoRec f rs) -> CoRec f (r ': rs)

-- I could use the singleton list as the base case instead,
-- but this looks nicer and behaves identically.
instance Eq (CoRec f '[]) where
  _ == _ = error "empty CoRec does not exist"
instance (Eq (CoRec f rs), Eq r) => Eq (CoRec f (r ': rs)) where
  CoRecHere a == CoRecHere b = a == b
  CoRecThere anext == CoRecThere bnext = anext == bnext
  _ == _ = False

The Show instance could be written similarly to the one for Rec (with a RecAll constraint, and that would probably be nice, just for symmetry), but it can also be written like the Eq instance:

instance Show (CoRec f '[]) where
  _ == _ = error "empty CoRec does not exist"
instance (Show (CoRec f rs), Show r) => Eq (CoRec f (r ': rs)) where
  show (CoRecHere a) = "CoRec { " ++ show a ++ " }"
  CoRecThere anext = show anext

Additionally, CoRec could have Ord and Semigroup instances (I don't think that Monoid is possible because mempty really has no good option), but there is no most obvious choice for the correct one. Let's write one possible append function anyway:

-- This is left biased. That is, if the corecords contain different elements,
-- it will choose whichever one has a value first.
appendCoRec :: RecAll f rs Monoid => CoRec f rs -> CoRec f rs -> CoRec f rs
appendCoRec (CoRecHere a) (CoRecHere b) = a <> b
appendCoRec (CoRecHere a) (CoRecThere _) = a
appendCoRec (CoRecThere _) (CoRecHere b) = b
appendCoRec (CoRecThere anext) (CoRecThere bnext) = appendCoRec anext bnext

I'll keep thinking about this more to see if I can come up with anything more compelling at some point.

acowley commented 8 years ago

All of those are actually already defined.

andrewthad commented 8 years ago

How do you mean that? There is a show instance, but it only works on CoRecs whose values are wrapped in a Dict. And I do not see an Eq instance for CoRec. Are you referring to a branch other than master?

acowley commented 8 years ago

Admittedly, I'm in a car so may be misreading everything, but are the instances starting here not what you're referring to?

andrewthad commented 8 years ago

Ah, I did not see those earlier. For some reason, those don't show up on the haddocks. Those are not quite as powerful as the instances I gave because they only work on CoRec Identity instead of the more general CoRec f.

I'm beginning to see how you are able to accomplish things with the RElem constraint. It seems like a lot of functions end up taking a RecApplicative constraint to get a hold of something to do induction on. This is largely but not entirely mitigated with my approach. I say not entirely because of this:

corecToRec :: RecApplicative ts => CoRec f ts -> Rec (Maybe :. f) ts
corecToRec (Col x) = rput (Compose $ Just x) (rpure (Compose Nothing))

If I rewrite this function the inductively defined CoRec, I still need the RecApplicative constraint (or equivalently an additional argument of type Rec proxy ts), which is disappointing.

In conclusion, I still think that the inductive definition of CoRec is the one I would rather work with, but I think that's ultimately just my opinion on the topic. I would rather have the typeclass-based CoRec go into a separate library than into vinyl itself, but I'll be cool with whatever you decide. Thanks for being so patient responding to all my comments.

sboosali commented 6 years ago

CoRec has been merged into vinyl, right? should we close this?