agentm / project-m36

Project: M36 Relational Algebra Engine
The Unlicense

Nested relations with Tupleable #190

Closed. MDeltaX closed this issue 6 years ago

MDeltaX commented 6 years ago

Is it possible to model nested relations with Haskell types? I've tried:

data Blog = Blog
  { title :: Text
  , comments :: [Comment] -- also tried: Set Comment
  } deriving (Generic, Tupleable)

data Comment = Comment
  { authorName :: Text
  , comment :: Text
  } deriving (Generic, Tupleable, Show, Eq, Binary, NFData, Atomable)

but [Comment] and Set Comment are converted to ConstructedAtom instead of RelationAtom.

agentm commented 6 years ago

It is possible to use nested relations with Tupleable but not with the generically-derived instance. The issue is that generics don't provide sufficient type information to marshal a RelationAtom back-and-forth.

However, you can easily hardcode the type information yourself into the Tupleable instance: just implement toTuple, fromTuple, and toAttributes.
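To make the shape of such a hand-written instance concrete, here is a minimal, self-contained sketch. It deliberately does not import project-m36: `AtomType`, `Atom`, `Tuple`, `Attributes`, and the `Tupleable` class below are simplified stand-ins invented for illustration, and the real library's types and signatures differ (for example, it uses `Text` and its own error type). The point is only that the instance spells out the nested attribute types by hand, which is the information the generic derivation cannot supply.

```haskell
import Data.Proxy (Proxy (..))

-- Stand-in types for illustration only; these are NOT the project-m36 definitions.
data AtomType = TextT | RelationT Attributes
  deriving (Eq, Show)

type Attributes = [(String, AtomType)]

data Atom
  = TextAtom String
  | RelationAtom Attributes [[Atom]] -- nested relation: heading plus rows
  deriving (Eq, Show)

data Tuple = Tuple Attributes [Atom]
  deriving (Eq, Show)

-- Simplified analogue of the three methods named in the thread.
class Tupleable a where
  toTuple :: a -> Tuple
  fromTuple :: Tuple -> Either String a
  toAttributes :: Proxy a -> Attributes

data Comment = Comment { authorName :: String, comment :: String }
  deriving (Eq, Show)

data Blog = Blog { title :: String, comments :: [Comment] }
  deriving (Eq, Show)

commentAttrs :: Attributes
commentAttrs = toAttributes (Proxy :: Proxy Comment)

instance Tupleable Comment where
  toAttributes _ = [("authorName", TextT), ("comment", TextT)]
  toTuple c =
    Tuple commentAttrs [TextAtom (authorName c), TextAtom (comment c)]
  fromTuple (Tuple _ [TextAtom a, TextAtom c]) = Right (Comment a c)
  fromTuple _ = Left "unexpected Comment tuple"

instance Tupleable Blog where
  -- The nested heading is hardcoded here instead of derived generically.
  toAttributes _ = [("title", TextT), ("comments", RelationT commentAttrs)]
  toTuple b =
    Tuple
      (toAttributes (Proxy :: Proxy Blog))
      [ TextAtom (title b)
      , RelationAtom commentAttrs [atoms | Tuple _ atoms <- map toTuple (comments b)]
      ]
  -- Each nested row is rebuilt into a Comment via its own fromTuple.
  fromTuple (Tuple _ [TextAtom t, RelationAtom attrs rows]) =
    Blog t <$> mapM (fromTuple . Tuple attrs) rows
  fromTuple _ = Left "unexpected Blog tuple"

main :: IO ()
main = do
  let blog = Blog "Nested relations" [Comment "alice" "first", Comment "bob" "second"]
  print (toTuple blog)
  -- Round trip: marshal to a tuple and back, then compare with the input.
  print (fromTuple (toTuple blog) == Right blog)
```

One caveat of this sketch: a real relation is an unordered set of tuples, so a round trip through an actual RelationAtom should not be expected to preserve list order or duplicate comments.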

Historical note: the original Tupleable instance did contain sufficient type information to marshal RelationAtoms, but the additional type information made it a burden on all other types, so I removed it.

agentm commented 6 years ago

Let me know if an example would be helpful. It would probably be useful to others, as well.

MDeltaX commented 6 years ago

An example would be great, you could extend docs/tupleable.markdown with it.

MDeltaX commented 6 years ago

Can't I instead implement Atomable [Comment] and use generics for Tupleable Blog and Tupleable Comment?

agentm commented 6 years ago

@koenigmaximilian, please take a look at the new example.

Let me know if you have any questions.

MDeltaX commented 6 years ago

@agentm Thank you!

YuMingLiao commented 5 years ago

Hi @agentm, I am looking for generic ways to make nested relations too.

Historical note: the original Tupleable instance did contain sufficient type information to marshal RelationAtoms, but the additional type information made it a burden on all other types, so I removed it.

What if the Atomable typeclass turned a record type (one whose fields have selector names) into a RelationAtom? Then Tupleable could generically turn any record field with an Atomable instance into a RelationAtom.

This seems intuitive to me, since a type whose fields have selector names roughly matches a relation with attribute names. ConstructedAtom seems to lose some information about the selector names.

I would like to hear your opinion. What do you think of this idea? Is it reasonable to make it the default, generic behavior?

The one (very odd) case I can think of is when two record types each contain the other as a field. That would produce infinitely nested RelationAtoms, but the same is true of ConstructedAtoms. It still seems worthwhile to model nested relations generically from Haskell types.

YuMingLiao commented 5 years ago

I think I wasn't thinking it through.

A RelationAtom is a Relation that can have many tuples. However, a record-typed field can only be a relation with exactly one tuple. A RelationAtom is therefore more like a field holding a list of records in Haskell than a field holding a single record. An Atom is more like a sum type, and a Relation is more like a product type. RelationAtom seems to be meant for grouping scenarios.
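The grouping scenario can be sketched in plain Haskell. This is only an illustration under assumed shapes: `FlatRow`, `NestedRow`, `groupComments`, and `ungroupComments` are hypothetical names, lists stand in for relations, and nothing here is project-m36 API. It shows a flat relation with one tuple per comment being grouped into one tuple per blog whose "comments" value is itself a relation with many tuples.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- Hypothetical flat row: (blog title, comment author, comment text).
type FlatRow = (String, String, String)

-- Hypothetical nested row: a blog title plus a relation-valued attribute,
-- modelled here as a list of (author, text) pairs.
type NestedRow = (String, [(String, String)])

-- Group the flat rows by title, collecting the remaining attributes
-- into the nested value.
groupComments :: [FlatRow] -> [NestedRow]
groupComments rows =
  [ (t, [(a, c) | (_, a, c) <- grp])
  | grp@((t, _, _) : _) <- groupBy ((==) `on` rowTitle) (sortOn rowTitle rows)
  ]
  where
    rowTitle (t, _, _) = t

-- Inverse direction: flatten each nested value back into one row per comment.
-- A blog with no comments produces no flat rows at all.
ungroupComments :: [NestedRow] -> [FlatRow]
ungroupComments nested = [(t, a, c) | (t, cs) <- nested, (a, c) <- cs]

main :: IO ()
main = do
  let flat =
        [ ("Intro", "alice", "hi")
        , ("News", "bob", "nice")
        , ("Intro", "carol", "thanks")
        ]
  print (groupComments flat)
  print (ungroupComments (groupComments flat))
```

Seen this way, a single record-typed field corresponds to a nested value with exactly one row, while a list-typed field corresponds to the general case.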

I guess I'll go with the custom Tupleable example.

agentm commented 5 years ago

Indeed, the nesting is tricky to represent generically. There are a few options:

None of the options are particularly appealing, which is why I simply chose to force users to implement their own Tupleable instance for this unusual case.

Please let me know if you gain any further insight into how this could be represented better.

YuMingLiao commented 5 years ago

Thanks for the advice! After further thinking, I have come to this conclusion.

My scenario is storing the DbRecord a type from project-m36-typed in project-m36. It saves additional information about a Haskell value / an M36 tuple. I can:

  1. make it one big, flat relation.
  2. make it two relations, where the record relation has a foreign key to the entity relation's identifier.

I guess option 2 is the better, more formal way to implement a database with relational algebra, since it uses a foreign key and a join.
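As a rough sketch of the two-relation option, the following models both relations as plain Haskell lists. All names here (`Entity`, `RecordMeta`, `refEntityId`, `createdBy`, `foreignKeyHolds`, `joinOnId`) are hypothetical and chosen only for illustration; the actual fields of DbRecord in project-m36-typed may look quite different, and project-m36 itself would express the constraint and the join in its own relational expressions.

```haskell
-- Hypothetical entity relation: one tuple per stored value.
data Entity = Entity { entityId :: Int, name :: String }
  deriving (Eq, Show)

-- Hypothetical record relation: extra information about an entity,
-- pointing at it through refEntityId.
data RecordMeta = RecordMeta { refEntityId :: Int, createdBy :: String }
  deriving (Eq, Show)

-- Foreign key constraint: every refEntityId must exist among the entity ids.
foreignKeyHolds :: [RecordMeta] -> [Entity] -> Bool
foreignKeyHolds metas entities =
  all (\m -> refEntityId m `elem` map entityId entities) metas

-- Join the two relations on the identifier.
joinOnId :: [RecordMeta] -> [Entity] -> [(RecordMeta, Entity)]
joinOnId metas entities =
  [(m, e) | m <- metas, e <- entities, refEntityId m == entityId e]

main :: IO ()
main = do
  let entities = [Entity 1 "first", Entity 2 "second"]
      metas = [RecordMeta 1 "alice", RecordMeta 2 "bob"]
  print (foreignKeyHolds metas entities)
  mapM_ print (joinOnId metas entities)
```

The join recovers the same information the single flat relation would hold, without repeating entity attributes in every record tuple.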

Nested relations seem better suited to statistical reports, which need an aggregation operation afterwards. So I agree that it is not necessary to implement Haskell value-to-nested-relation conversion internally.