agentm / project-m36

Project: M36 Relational Algebra Engine
The Unlicense

can't construct a list that is more than one element. #241

Closed YuMingLiao closed 5 years ago

YuMingLiao commented 5 years ago

Hi, I was trying out List in project-m36. I couldn't find any documentation on Lists, so I tried this on my own, but it failed.

TutorialD (master/main): data Hair = Bald | Brown | Blond | OtherColor Text
TutorialD (master/main): b :: { a List Hair }
TutorialD (master/main): :showexpr b
┌─────────────────┐
│a::List (a::Hair)│
└─────────────────┘
TutorialD (master/main): b := relation{tuple{ a (Cons Bald Empty) }}
TutorialD (master/main): b := relation{tuple{ a (Cons Bald (Cons Bald Empty)) }}
ERR: NoSuchFunctionError "Bald"

YuMingLiao commented 5 years ago

Also, a list in a relvar in my program is displayed like this (a single Cons for three elements):

TutorialD (master/main): :showexpr Profile {profileSkills}
┌──────────────────────────────────────────┐
│profileSkills::List (a::Skill)            │
├──────────────────────────────────────────┤
│Cons Woodwork Ironwork Plumber_Electrician│
└──────────────────────────────────────────┘

I don't know whether this is expected.

agentm commented 5 years ago

I fixed the list creation bug; luckily, it was just a parsing bug.

Could you provide instructions on how to reproduce the Cons cell with the wrong number of arguments? I see this:

TutorialD (master/main): b := relation{tuple{ a (Cons Bald Bald Empty) }}
ERR: ConstructedAtomArgumentCountMismatchError 2 3

which is the expected behavior, so there must be a different way to construct the Profile relvar that I am missing.

YuMingLiao commented 5 years ago

It happens in toAtom in the Atomable [a] instance. Luckily, it is an easy fix, too.

λ> data A = A | B | C deriving (Show, Eq, Ord, Generic, NFData, Binary, Atomable)                                                   
λ> toAtom [A,B,C]
ConstructedAtom "Cons" (ConstructedAtomType "List" (fromList [("a",ConstructedAtomType "A" (fromList []))])) [ConstructedAtom "A" (ConstructedAtomType "A" (fromList [])) [],ConstructedAtom "B" (ConstructedAtomType "A" (fromList [])) [],ConstructedAtom "C" (ConstructedAtomType "A" (fromList [])) []]
λ> atomToText it
"Cons A B C"

--Atomable.hs
-- toAtom (x:xs) = ConstructedAtom "Cons" (listAtomType (toAtomType (Proxy :: Proxy a))) (map toAtom (x:xs))
++ toAtom (x:xs) = ConstructedAtom "Cons" (listAtomType (toAtomType (Proxy :: Proxy a))) [toAtom x, toAtom (xs)]

λ> data A = A | B | C deriving (Show, Eq, Ord, Generic, NFData, Binary, Atomable)                                             
λ> toAtom [A,B,C]
ConstructedAtom "Cons" (ConstructedAtomType "List" (fromList [("a",ConstructedAtomType "A" (fromList []))])) [ConstructedAtom "A" (ConstructedAtomType "A" (fromList [])) [],ConstructedAtom "Cons" (ConstructedAtomType "List" (fromList [("a",ConstructedAtomType "A" (fromList []))])) [ConstructedAtom "B" (ConstructedAtomType "A" (fromList [])) [],ConstructedAtom "Cons" (ConstructedAtomType "List" (fromList [("a",ConstructedAtomType "A" (fromList []))])) [ConstructedAtom "C" (ConstructedAtomType "A" (fromList [])) [],ConstructedAtom "Empty" (ConstructedAtomType "List" (fromList [("a",ConstructedAtomType "A" (fromList []))])) []]]]                              
λ> atomToText it
"Cons A (Cons B (Cons C Empty))"
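
To make the difference between the two encodings concrete, here is a minimal, self-contained sketch. It does not use Project:M36's real types: `Atom`, `encodeFlat`, `encodeList`, and `render` are hypothetical stand-ins that drop the type information carried by `ConstructedAtom` and only handle nullary element constructors.

```haskell
-- Simplified stand-in for a constructed atom: a constructor name plus arguments.
-- (Assumption: the real ConstructedAtom also carries an atom type, omitted here.)
data Atom = Atom String [Atom]
  deriving (Show, Eq)

-- The buggy shape: every element becomes a direct argument of a single Cons.
encodeFlat :: [String] -> Atom
encodeFlat xs = Atom "Cons" [Atom x [] | x <- xs]

-- The fixed shape: each Cons has exactly two arguments, the head and the encoded tail.
encodeList :: [String] -> Atom
encodeList []       = Atom "Empty" []
encodeList (x : xs) = Atom "Cons" [Atom x [], encodeList xs]

-- Render an atom as text, parenthesizing any argument that itself has arguments.
render :: Atom -> String
render (Atom name args) = unwords (name : map renderArg args)
  where
    renderArg a@(Atom _ []) = render a
    renderArg a             = "(" ++ render a ++ ")"
```

With these definitions, `render (encodeFlat ["A","B","C"])` gives `"Cons A B C"`, while `render (encodeList ["A","B","C"])` gives `"Cons A (Cons B (Cons C Empty))"`, matching the two outputs above.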

YuMingLiao commented 5 years ago

Thanks for checking!

Could I ask you some more questions? Why does project-m36 need run-time data type creation? Is it possible to make Set and List primitive atom types? I would like to express skills as Set Skill, but I am not sure why there are no built-in Set and List types. Would that violate relational algebra or 1NF?

YuMingLiao commented 5 years ago

And fromAtom also needs to change.

λ> data A = A | B | C deriving (Show, Eq, Ord, Generic, NFData, Binary, Atomable)                                                   
λ> toAtom [A,B,C]
ConstructedAtom "Cons" (ConstructedAtomType "List" (fromList [("a",ConstructedAtomType "A" (fromList []))])) [ConstructedAtom "A" (ConstructedAtomType "A" (fromList [])) [],ConstructedAtom "Cons" (ConstructedAtomType "List" (fromList [("a",ConstructedAtomType "A" (fromList []))])) [ConstructedAtom "B" (ConstructedAtomType "A" (fromList [])) [],ConstructedAtom "Cons" (ConstructedAtomType "List" (fromList [("a",ConstructedAtomType "A" (fromList []))])) [ConstructedAtom "C" (ConstructedAtomType "A" (fromList [])) [],ConstructedAtom "Empty" (ConstructedAtomType "List" (fromList [("a",ConstructedAtomType "A" (fromList []))])) []]]]                              
λ> fromAtom it :: [A]
[A,*** Exception: no fromAtomG traversal found
CallStack (from HasCallStack):
  error, called at /root/project-m36/src/lib/ProjectM36/Atomable.hs:46:16 in main:ProjectM36.Atomable 

-- fromAtom (ConstructedAtom "Cons" _ (x:xs)) = fromAtom x:map fromAtom xs
++ fromAtom (ConstructedAtom "Cons" _ (x:y:[])) = fromAtom x : fromAtom y

λ> data A = A | B | C deriving (Show, Eq, Ord, Generic, NFData, Binary, Atomable)                                                   
λ> toAtom [A,B,C]
ConstructedAtom "Cons" (ConstructedAtomType "List" (fromList [("a",ConstructedAtomType "A" (fromList []))])) [ConstructedAtom "A" (ConstructedAtomType "A" (fromList [])) [],ConstructedAtom "Cons" (ConstructedAtomType "List" (fromList [("a",ConstructedAtomType "A" (fromList []))])) [ConstructedAtom "B" (ConstructedAtomType "A" (fromList [])) [],ConstructedAtom "Cons" (ConstructedAtomType "List" (fromList [("a",ConstructedAtomType "A" (fromList []))])) [ConstructedAtom "C" (ConstructedAtomType "A" (fromList [])) [],ConstructedAtom "Empty" (ConstructedAtomType "List" (fromList [("a",ConstructedAtomType "A" (fromList []))])) []]]]                              
λ> fromAtom it :: [A]
[A,B,C]
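
The decoding direction can be sketched the same way. This is only an illustration under assumptions, not the library's code: `Atom`, `decodeList`, and `decodeElem` are hypothetical, elements are restricted to nullary constructors, and errors are returned as `Either` values rather than thrown as in the exception above.

```haskell
-- Simplified stand-in for a constructed atom: a constructor name plus arguments.
data Atom = Atom String [Atom]
  deriving (Show, Eq)

-- Decode a nested Cons/Empty atom back into a list of element names.
-- A Cons must have exactly two arguments: the head element and the rest of the list.
decodeList :: Atom -> Either String [String]
decodeList (Atom "Empty" [])        = Right []
decodeList (Atom "Cons" [hd, rest]) = (:) <$> decodeElem hd <*> decodeList rest
decodeList (Atom "Cons" args)       =
  Left ("Cons expects 2 arguments, got " ++ show (length args))
decodeList (Atom name _)            = Left ("unexpected atom: " ++ name)

-- Elements in this sketch are nullary constructors such as A, B, or C.
decodeElem :: Atom -> Either String String
decodeElem (Atom name []) = Right name
decodeElem (Atom name _)  = Left ("element " ++ name ++ " is not a nullary constructor")
```

Matching on exactly two arguments means a flat `Cons A B C` atom is rejected with an arity error instead of being partially decoded.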

agentm commented 5 years ago

Ah, thanks. I didn't try using Atomable directly.

Your question regarding why Project:M36 supports algebraic data types is a good one! This topic is specifically covered in the paper "Out of the Tarpit" where ADTs are part of an effort to reduce bugs by allowing developers to program in their assumptions instead of relying on programming-language specific abstractions. I hope you agree that "Bald | Brown | Blond | OtherColor String" is more specific than "Int" or a simple enumeration. However, the paper recommends not allowing for product types due to concerns about normalization. I wasn't convinced by this argument, so, in the interest of being Haskell-like, I implemented support for product and sum types with the same syntax as Haskell.

Could we implement our data types without ADTs? Sure, just like we do with painful workarounds in SQL or JSON document databases. Furthermore, ADTs solve the pervasive NULL problem.

In conclusion, I would recommend avoiding Lists in a database and embracing ADTs, though Project:M36 does try to make Lists first class objects (instead of CSV strings).

YuMingLiao commented 5 years ago

Thanks for the explanation! I agree the "OtherColor String" example is way more specific.

data Skill = A | B | C

So if I want to record a person with their skill set and then see who has skill A, I guess I should first make Data.Set Atomable on the Haskell side:

Profile{ person Text, skills (Set Skill) }

(If I use a relation to represent the set, I can't give it a unique key, and the set property will be lost.)

Then I would create a read-only view that (kind of) ungroups the skill set to avoid repeating information:

View{ person Text, skill Skill }

Then I can query View where skill=A

Do these two things make sense to you in project-m36?

  1. A read-only view just for viewing. Maybe a shortcut operator? (The := operator creates a new relvar that is not changed by updates to the original one, while :@ would create a view whose data changes along with the original relvar.) View :@ Profile ungroup_container skills

  2. An ungroup_container operator that ungroups an ADT container instead of a subrelation.

-- Another way is to just write a boolean function for Set: ^has (A, skills)

Could you give me your opinion on this? I am scratching my head over it.

agentm commented 5 years ago

Well, with regard to normalization, I would suggest neither a list nor a set for this case. Instead, make use of the relational algebra. "Skill", itself, can be a key since it must support equality, so one can create a many-to-many relationship between Profiles and Skills. The Skill can definitely be a unique key, but maybe that needs clarification in the documentation.

TutorialD (master/main): data Skill = A | B
TutorialD (master/main): :showexpr relation{tuple{a A},tuple{a B},tuple{a B}}
┌────────┐
│a::Skill│
├────────┤
│A       │
│B       │
└────────┘

The advantage of using the relational algebra to make the relationships is the ease of being able to issue arbitrary queries to answer any question you can ask about the data. In your Set-based strategy above, for example, you would need to implement the Set operators on the atom in order to figure out which profiles include which skills, but the native relational algebra does not require additional atom-based operators.
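
As a rough illustration of this point in plain Haskell (not Project:M36's API), a many-to-many relation can be modeled as a set of (person, skill) tuples. The names `ProfileSkill`, `profileSkills`, and `peopleWith`, as well as the sample data, are hypothetical. The query "who has skill A" then needs only a restriction and a projection over the tuples, with no Set-specific operators on an atom.

```haskell
import qualified Data.Set as Set

data Skill = A | B | C
  deriving (Show, Eq, Ord)

type Person = String

-- One tuple per (person, skill) pair. Because a relation is a set of tuples,
-- duplicate tuples collapse, just as in the :showexpr output above.
type ProfileSkill = Set.Set (Person, Skill)

-- Hypothetical sample data; ("bob", B) is listed twice on purpose.
profileSkills :: ProfileSkill
profileSkills =
  Set.fromList [("alice", A), ("alice", B), ("bob", B), ("bob", B), ("carol", A)]

-- Restriction (keep tuples with the given skill) followed by projection onto person.
peopleWith :: Skill -> ProfileSkill -> Set.Set Person
peopleWith s = Set.map fst . Set.filter ((== s) . snd)
```

Here `peopleWith A profileSkills` yields the set containing "alice" and "carol", and `profileSkills` holds four tuples rather than five because the repeated ("bob", B) tuple is stored only once.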

Another strategy, to which I think you alluded, is to use a nested relation to represent the skills. Project:M36 supports this.

Ultimately, you are suggesting views because many of these representations are isomorphic! That's key to understanding data independence because the database can choose whichever representation it finds most ideal for the scenario, or it can choose to represent all of them in memory or on disk. So, in the end, it shouldn't matter which one of the isomorphic representations you choose. Stay tuned for exciting new Project:M36 features on this front!
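
The correspondence between the flat and nested representations can be sketched in plain Haskell. This is an analogy under assumptions, not Project:M36 code: `Flat`, `Nested`, `groupSkills`, and `ungroupSkills` are hypothetical names, with a `Map` standing in for a relation that has a nested skills attribute.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

data Skill = A | B | C
  deriving (Show, Eq, Ord)

type Person = String

-- Flat form: one tuple per (person, skill) pair.
type Flat = Set.Set (Person, Skill)

-- Nested form: one entry per person, holding that person's set of skills.
type Nested = Map.Map Person (Set.Set Skill)

-- Collect each person's skills into a nested set.
groupSkills :: Flat -> Nested
groupSkills flat =
  Map.fromListWith Set.union [(p, Set.singleton s) | (p, s) <- Set.toList flat]

-- Expand each nested set back into individual tuples.
ungroupSkills :: Nested -> Flat
ungroupSkills nested =
  Set.fromList [(p, s) | (p, ss) <- Map.toList nested, s <- Set.toList ss]
```

Going from flat to nested and back returns the original set of tuples. The reverse round trip only holds when no person has an empty skill set, since such a person produces no tuples in the flat form.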

YuMingLiao commented 5 years ago

I see.

I guess I can turn a Set a into a RelationAtom in Atomable. So a Profile with a Set a field type will be a nested relation in Tupleable. Then I can get a many-to-many relationship between Profiles and Skills by ungroup.

Thanks for the enlightenment!