Closed agentm closed 6 years ago
I fixed #203, so the nested, arbitrary relation construction should be able to be coded at line 610 in Base.hs. @YuMingLiao, would you like to finish this work?
I merged the master into my version of your branch in this repository.
Sure! I will finish it. I am working on arbitrary DateTime now. I may need some time to understand git mechanism, ConstructedAtomType and TypeVariable.
Thanks for your support!
It occurred to me that the ConstructedAtomType
Arbitrary
instance will require access to the DatabaseContext
since the AtomType
alone does not contain the list of available value constructors. If you want to pass that instance on to me to finish, that's fine.
Good news! I have managed to create arbitrary ConstructedAtomType
!
TutorialD (master/main): createarbitraryrelation b {a Hair} 5-10
TutorialD (master/main): :showexpr b
┌─────────────┐
│a::Hair │
├─────────────┤
│White │
│White │
│Other "Sunny"│
│Other "Mary" │
│Black │
│Other "Mary" │
│White │
│Black │
│Other "Ted" │
└─────────────┘
But I still don't understand the usage of TypeVariable
. Could you give me an example of that in TutorialD expression?
By the time arbitrary'
sees the AtomType
, all types should be fully resolved, so you shouldn't see any TypeVariable
s. I've been thinking about splitting AtomType
s into resolved and unresolved types, but I would need to do some refactoring first.
Thanks for your hard work! I'll test out your branch today.
You are welcome! Take your time, and thanks for the great project!
I fixed the attribute resolution via #205, so there are new possibilities parsing. In testing, I found a case that results in an error from within QuickCheck:
TutorialD (master/main): createarbitraryrelation y {spam Int, spam2 Interval Int} 3-100
ERR: UnhandledExceptionError "QuickCheck.elements used with empty list\nCallStack (from HasCallStack):\n error, called at ./Test/QuickCheck/Gen.hs:186:15 in QuickCheck-2.10.0.1-ALuGysu7txTB2VnkPyDuTq:Test.QuickCheck.Gen"
Could you take a look?
I also added some basic tests for this feature: cabal test test-tutoriald
.
It happens in arbitray' ConstructedAtomType
,
I assume Interval should be parsed as IntervalAtomType atomTy
, but it is parsed as a ConsturctedAtomType
.
TutorialD (master/main): createarbitraryrelation a {spam Interval Int} 3-100
result parsed by makeAttributeExprsP
[AttributeAndTypeNameExpr "spam" (ADTypeConstructor "Interval" [ADTypeConstructor "Int" []]) ()]
result while retrieving a TypeConstructor
in arbitray' ConstructedAtomType
Just (ADTypeConstructorDef "Interval" ["a"],[])
I expected the TypeConstructor
found at least have one DataConstructorDef
(Is my assumption right? data TypeConstructorName = DataConstructorDef1 ...), but Interval has none.
Hence while the program randomly choose a DataConstructorDef
by elements, it cause this empty list error.
Why is it not parsed as IntervalAtomType
? Is it deprecated?
What is the difference between ADTypeConstructor and PrimitiveTypeConstructor?
I'll look into it more tomorrow.
I see now, in test, you expect Interval to be (ConstructedAtomType "Interval" (M.singleton "a" IntegerAtomType)
I see- this may be another primitive AtomType bug. I'll take a closer look.
Okay, I see that now.
basicTypeConstructorMapping :: TypeConstructorMapping
basicTypeConstructorMapping = primitiveTypeConstructorMapping ++
maybeTypeConstructorMapping ++
eitherTypeConstructorMapping ++
listTypeConstructorMapping ++
intervalTypeConstructorMapping
intervalTypeConstructorMapping
is different from primitiveTypeConstructorMapping
.
atomTypeForTypeConstructor tCons tConss tvMap = case findTypeConstructor (TC.name tCons) tConss of
Nothing -> Left (NoSuchTypeConstructorError (TC.name tCons))
Just (PrimitiveTypeConstructorDef _ aType, _) -> Right aType
Just (tConsDef, _) -> do
tConsArgTypes <- mapM (\tConsArg -> atomTypeForTypeConstructor tConsArg tConss tvMap) (TC.arguments tCons)
let pVarNames = TCD.typeVars tConsDef
tConsArgs = M.fromList (zip pVarNames tConsArgTypes)
Right (ConstructedAtomType (TC.name tCons) tConsArgs)
So atomTypeForTypeConstructor
thinks it is a ConstructedAtomType
.
I noticed that primitiveTypeConstructorP is taken off. Now all the primitive types become a ADTypeConstructor
first, then turned back when atomTypeForTypeConstructor
.
I guess that's what you said eariler about more possibility parsing.
So, should Interval be a primitiveType
?
It can't be in primitiveTypeMapping
because it has an argument.
So it should be resolved form
ConstructedAtomType TypeConstructorName TypeVarMap
to
IntervalAtomType AtomType
in
atomTypeForTypeConstructor
similar problem:
TutorialD (master/main): :showexpr relation{a Interval Int}{tuple{a interval(3,5,f,f)}}
ERR: AtomTypeMismatchError (ConstructedAtomType "Interval" (fromList [("a",IntAtomType)])) (IntervalAtomType IntegerAtomType)
Interval
should ultimately become a ConstructedAtomType
since it is clearly composed of primitives. I chose not to implement it that way because of some special syntax around it. For example, Interval
does not have any visible data constructors and it prints and gets parsed (in CSV) specially. Since this current design confused me, I am working on making it a ConstructedAtomType
.
After I make changes in atomTypeForTypeConstructor
. Interval can be generated now.
TutorialD (master/main): createarbitraryrelation e {spam Interval Day} 3-5
[AttributeAndTypeNameExpr "spam" (ADTypeConstructor "Interval" [ADTypeConstructor "Day" []]) ()]
TutorialD (master/main): :showexpr e
┌───────────────────────┐
│spam::Interval Day │
├───────────────────────┤
│[1858-12-06,1858-12-06)│
│(1858-10-20,1858-10-20)│
│(1858-12-16,1858-12-16]│
│(1858-11-11,1858-11-11]│
└───────────────────────┘
But as you can see, there is some logic error in generating an Interval...haha
TutorialD (master/main): createarbitraryrelation e {spam Interval Day} 3-5
TutorialD (master/main): :showexpr e
┌───────────────────────┐
│spam::Interval Day │
├───────────────────────┤
│[1858-12-10,1858-10-20]│
│[1858-12-11,1858-11-18)│
│[1858-11-12,1858-12-05]│
└───────────────────────┘
it has two different values now.
Interval may need some restriction. The type in it can only be an instance of Comparable typeclass or something like that.
Yea, we'll need some Interval-specific code to make the arbitrary instance generate valid intervals. In the future, users may have to specify their own arbitrary instances.
Indeed, I'm feeling more aware that we need to support typeclasses for custom types. Such classes would be useful for import/export, validation, etc.
I should have a replacement Interval
type ready tomorrow.
Sorry, @YuMingLiao, I think our work criss-crossed. I finalized the changes necessary to make IntervalAtomType
no longer a primitive type. The work is on my version of your branch.
I see. Never mind. I'm just trying to understand more code.
Good job! Thanks.
Where should I look to fetch your version? Maybe I can try if Interval works as expected.
About typeclass, I feel the same way. If Atomtype for Interval can be gt
or lt
, Then it is not far from having a sorted DataFrame
.
The Interval
changes are on my arbitrary_tupleset branch. I can still reproduce the Gen
error using relation{a Interval Int}
. Which line in Arbitrary
causes the failure?
It's elements
in arbitrary (ConstructedType...
, about line 80.
it's because there is no DataConstructorDef for Interval. So elements dcDef
choose from empty list and cause error .
If Interval is ADTypeConstructor, shouldn't it be defined as
data Interval a = Interval a a BoolAtomType BoolAtomType
so it has at least one DataConstructor
?
I tried to add it.
intervalTypeConstructorMapping :: TypeConstructorMapping
--intervalTypeConstructorMapping = [(ADTypeConstructorDef "Interval" ["a"], [])]
intervalTypeConstructorMapping = [(ADTypeConstructorDef "Interval" ["a"],
[DataConstructorDef "Interval" [DataConstructorDefTypeVarNameArg "a",
DataConstructorDefTypeVarNameArg "a",
]])
]
But I can't add the BoolAtomType part because DataConstructorDefArg
doesn't have primitiveAtomType
data DataConstructorDefArg = DataConstructorDefTypeConstructorArg TypeConstructor |
DataConstructorDefTypeVarNameArg TypeVarName
(time to bed...good day/night!)
Ah, I understand. In this case, I chose not to include a public data constructor because that would allow construction of invalid intervals (such as Interval 1 0
). Users must use the AtomFunction
interval
to construct Interval
s which validates the values. This is similar to Haskell module exports/hiding.
Until we have typeclasses which would allow user-defined types to implement their own Arbitrary
instances, I think I will add a special case for Interval
s. I suspected a special case would be required anyway because dropping random values into an Interval
is likely to create invalid values anyway.
What do you think? Is there a better approach?
I added a special case to generate valid Interval
s in Arbitrary.hs
. Let me know what you think.
Sorry I couldn't give instant feedback. I don't have strong opinion about this special Interval thing. It can wait. But you implemented it so quick. And It works as expected! That's brilliant! Thanks!
TutorialD (master/main): createarbitraryrelation b {spam Interval Int} 3-5
TutorialD (master/main): :showexpr b
┌──────────────────────────┐
│spam::Interval (a::Int) │
├──────────────────────────┤
│Interval 11 30 False True │
│Interval -27 16 False True│
│Interval -22 -9 True True │
│Interval -8 1 False False │
└──────────────────────────┘
And recently I found fake It can generate readable and plausible things like names, id, address. Do you think it's worth implementing?
Fake
could be interesting for end-users, but the default types are generic. We could certainly offer additional types as useful add-ons. I think the additional types would be more useful with typeclasses, though.
Are there any more tests you can think of that would be useful?
what do you mean by additional types? I thought ConstructedAtomType already serves additional type.
I was thinking something like ...
createarbitraryrelation a {name Text, id Int, address Text, phoneNumber Text} 3-5
Then it will create a more readable, plausible relation than what the arbitrary'
function can give now.
I thought we could offer a PhoneNumber
type which would have an Arbitrary
instance to create phone numbers and so on.
Oh, I see what you mean now. That makes sense. The additional type can have some more property related to reality like PhoneNumber format validation.
and it can create arbitrary relation with additional types and let user make their own attribute names.
createarbitraryrelation a {employeeName Name, companyID ID, phoneNumber PhoneNumber} 3-5
So it is worth a try, right? easily creating more user case relation to play with? Let me dive into this...
Add-on types would certainly be useful, but they don't need to be part of this feature branch.
I was thinking that there could be a central repository for add-on types and schemas so that they could be easily loaded dynamically at runtime, but that's a future feature.
Okay then, you are right. And it seems odd to add AddOn types to ConstructedAtomType. I guess I just want to create more relations to play with. :p ... So I guess this is it. I've committed the final work and push to my feature/arbitrarau_tupleset. I'll pull a request.
Okay, I guess I don't need to pull new request because this request is already following my branch.
I pulled from your branch but it no longer compiles. It looks like arbitrary'
in Arbitrary.hs
disappeared or was moved. Do you see the same?
Oops. I have git pushed again. This time should be fine. Sorry for the inconvenience.
This is now merged. Thanks for your hard work on this feature- this is an unusual feature that other DBMSes don't have.
Glad to finish my first open source contribution here. Thanks for the support, mentoring and encouragement.
I created this pull request from your branch so that I could review the code in context. Feel free to create pull requests, too.