Open sjcjoosten opened 4 years ago
@stefjoosten could you re-title this into something without the word union-type? (I.e. give a non-mathematical description), and could you also confirm that this example demonstrates sort of what you decided you needed when writing Ampersand models?
(Stef did this on Apr 23rd 2020 and also reformulated the problem)
~I like this syntax. However I wonder if the concept expression has any added value. It could probably as well be omitted, I guess.~
~@hanjoosten what concept expression do you mean?~
Sorry, I realize I have been unclear, and the concern I have is slightly different too.
So let us look at your example (slightly modified the layout)
INTERFACE EmployeeContract : I[Employee]
BOX [ "Employee" : I
, "Contract" : contract
BOX [ "Start" : startdate
, "??" : I[FixedTermContract] BOX [ "End" : enddate ]
| I[PermanentContract] BOX []
]
]
The bindedExpression contract
could be any expression. In your example, you conveniently took a relation that has Contract
as the target concept. However what happens if the user would take another concept as the target. Let's take concept A
.
Further we define:
RELATION r[Employee * A]
RELATION s[A * Contract]
RELATION t[A * Contract]
Now a user could write another INTERFACE
statement, obviously with different semantics (a.o. no longer editable) too:
INTERFACE EmployeeContract2 : I[Employee]
BOX [ "Employee" : I
, "Contract" : r
BOX [ "Start" : s;startdate
, "??" : s;I[FixedTermContract] BOX [ "End" : enddate ]
| t;I[PermanentContract] BOX []
]
]
The point is, that the expressions s;I[FixedTermContract]
and t;I[PermanentContract]
could now become unrelated in all kinds of ways, making the semantics of the |
operator less obvious.
I would rather see that the context expressions in the related subinterfaces (related by the |
operator) would be limited to expressions of the form I[SmallerConcept]
. This allows for pattern matching too. We might even consider an extension like the case
statement in Haskell:
INTERFACE EmployeeContract : I[Employee]
BOX [ "Employee" : I
, "Contract" : contract
BOX [ "Start" : startdate
, "??" : case I of
FixedTermContract -> BOX [ "End" : enddate ]
| PermanentContract -> BOX []
]
]
Ah I love the case
keyword to introduce a list of interfaces that are separated in a different way from the ordinary ,
. I see your point that this syntax might be somewhat easier to come up with a semantics for. However, it would be at the expense of being harder to use and I don't believe that us having to think harder about its semantics is enough of a reason to restrict the syntax to concepts only. In particular, I think the restriced I
-only syntax suffers from almost all the same drawbacks as allowing the expressions s;I[FTC]
and t;I[PC]
, without its flexibility.
The additional properties of I
that you win to make the |
operator have "obvious semantics" is that the expressions are guaranteed to be partial functions (UNI
). You also guarantee that their targets are of a unifiable type, but that's something we can require (if we need to) in either case. What you lose is that users will be forced to introduce ISA
whenever an interface requires a case statement somewhere. Even in those cases where we would currently advice students not to use ISA
but to use relations instead (I believe ISA
is best reserved for permanent properties).
Furthermore, the semantics don't become "obvious" for arbitrary concepts either: the two concepts used could be partially overlapping and they might also not cover all cases. Instead, let's come up with a semantics in which the cases where those expressions areUNI
(and pairwise disjoint and that have a union that is TOT
) would indeed yield the "obvious semantics", but is generalized.
Another issue I want to raise is that if we were to convert your syntax to type-checked Ampersand, we would probably want to translate case exp of A -> .. | B -> ..
into something where the proper epsilon-conversion-relation is used. That is: caseof exp;eps1 -> .. | exp;eps2 -> ..
. Those look a lot like expressions to me and I would love to maintain the property that whatever we do with ISA
we can also do without ISA
(the generated code will change, but its behaviour should not).
I see your point. I think it would be great if we could achieve something like this. However, I also think that whatever syntax we end up with, we must be able at compiletime to check that we indeed we have disjunct sets, and that any possible target will be in exactly one of them. That way, at least we could warn the &-modeller that there is overlap or missing stuff, as in Haskells case
expression.
Let's talk semantics..
I believe that we need to come up with semantics for the following tuple:
( (exp1,IF1)
, (exp2,IF2) )
Here exp1
and exp2
are possibly restricted by a combination of syntax and our type checker (discussion ongoing). I believe whatever separates the union-types should be associative, so the semantics for the tuple will generalize to triples etc. I would also like the empty relation to be a left unit and I believe any semantics we'll come up with satisfy that, so this discussion should generalize to cases with 0 and 1 elements.
For an example in which the semantics are obvious, suppose exp1
is populated by (a,a),(b,b)
and exp2
is populated by (c,c),(d,d)
, then IF1
is shown if the root node is a
or b
, and IF2
is shown if the root node is c
or d
, and I believe it is safe to generalize and say that neither interface is shown if the root node is anything else.
In what follows, we populate exp1
and exp2
differently and see what choices we'll have to make.
Next, suppose there is overlap: exp1
is populated by (a,a),(b,b)
and exp2
is populated by (b,b),(c,c)
. The interesting case is when the root node is b
. I can see two possible types of behavior: either IF1
is shown because it occurs first, or both IF1
and IF2
are shown. I'll name these two variations FCFS (first come first serve) and WCABF (we can all be friends). For these specific populations, we can get one in terms of the other:
Suppose we build FCFS and a user wants WCABF, then the user can write the following instead:
( (exp1 /\ exp2,IF1 \/ IF2)
, (exp1,IF1)
, (exp2,IF2) )
(we never defined the union of two interfaces but I imagine it to mean repeating the interface fields of both and let's just assume the type checker accept this)
Suppose we build WCABF and a user wants FCFS, then the user can write the following instead:
( (exp1,IF1)
, (exp2 - exp1,IF2) )
I take the expressions exp1
and exp2
to be super simple, and the interfaces to be super involved, so I imagine WCABF semantics will result in less code duplication.
Next, I want to note that all of the above applies to heterogeneous semantics: if exp1
were (a,1),(b,2)
and exp2
were (b,2),(c,3)
we'd get the same two options and can repeat all of the above. But we are assuming that exp1
and exp2
have the same types in order to form the expressions exp1 /\ exp2
and exp2 - exp1
. Can we lift that restriction?
Let exp1
be populated by (a,a),(b,b)
and let's use (b,2),(c,3)
for exp2
. We can still assign the same semantics, but the conversion between the two is not so obvious anymore. I do see how we can get FCFS if we built WCABF. The user can write:
( (exp1,IF1)
, (exp2 - exp1;V, IF2) )
We cannot get WCABF if we build FCFS, because IF1 \/ IF2
is ill-typed. I tend to favor WCABF now.
If an expression is non-univalent, the question becomes how many times we wish to repeat an interface. Let's populate exp1
and exp2
again. Say exp1
has (a,0),(a,1)
and exp2
has (b,2)
as populations. The interesting case is what to do if the root node is a
. I can see two things happening: show IF1
for 0
only, or show IF2
twice, once for 0
and once for 1
. I'm inclined to call the former FCFS-2 and the latter WCABF-2. There is no way to write FCFS-2 in terms of WCABF-2 this time, because there is no way to get the first element of a set (or even any single element). The fact that FCFS-2 makes a choice of first element also makes me really dislike the idea of implementing FCFS-2, it does not feel clean. WCABF-2 would be buildable.
Like Han says in his previous comment, only going for the obvious case and giving the user trouble if he tries otherwise is appealing too. Currently, however, there is no way to tell Ampersand that two concepts are disjoint, though I don't think it would be difficult either.
I can see a problem with the original proposal:
INTERFACE EmployeeContract : I[Employee]
BOX [ "Employee" : I
, "Contract" : contract
BOX [ "Start" : startdate
, "??" : I[FixedTermContract] BOX [ "End" : enddate ]
| I[PermanentContract] BOX []
]
]
Expressions I[FixedTermContract]
and I[PermanentContract]
are generally not identities, not even properties, and not even endo.
Consider:
INTERFACE EmployeeContract : I[Employee]
BOX [ "Employee" : I
, "Contract" : contract
BOX [ "Start" : startdate
, "??" : bankAccount BOX [ "account number" : accNumber ]
| employee BOX ["employee name" : name]
]
]
If I understand Sebastiaan correctly, the fact that bankAccount
and employee
are disjoint (provable by the type system) makes this is equivalent to:
INTERFACE EmployeeContract : I[Employee]
BOX [ "Employee" : I
, "Contract" : contract
BOX [ "Start" : startdate
, "??" : I BOX [ "account number" : bankAccount;accNumber
, "employee name" : employee;name
]
]
]
The user cannot solve this with FCFS vs WCABF if bankAccount and employee are type incompatible. So should we then forbid this? (with a type error saying that bankAccount
and employee
have different target concepts?).
My tuppence worth of thought is this:
In the old Ampersand, each box contains a list of triples: (label, expression, box) with an empty box if the user doesn't write one.
I suggest that each box contains a list of triples: (label, expression, mapping(concept->box)).
The case-syntax proposed by Han has that. Since a mapping is a set, I hope that the |
operator can be made associative and commutative given the right semantics. In the mapping, I would qualify double occurrences of the same concept as a type error.
As for type checking, I'd say that every concept $c$ in the mapping must be smaller than or equal to the target concept of the box expression to enable the compiler to write code that determines the selected case uniquely.
For every concept $C$ in the mapping, we can compute a uniqe path to the target of the box expression, which is the root of a typology in which all concepts from the mapping occur. Suppose $A\leq B\leq C$ is this path. Then I would suggest that the resulting interface is simply the union of all three interfaces. The union of interfaces may be something like (informal Haskell):
-- Note: this function is partial! The type checker needs to ensure its precondition.
-- Note: instead of lists, I would like sets in order to get associativity and commutativity of the interface union.
-- however, the order of labels on the screen may require some compromise here.
triples (b1|b2)
= transpose
( [ l1 | l1 <- labels b1, l2 `elem` labels b2, l1==l2]++
[ l1 | l1 <- labels b1, l1 `notElem` labels b2]++
[ l2 | l2 <- labels b2, l2 `notElem` labels b1]
, [ e1 \cup e2 | (l1,e1,_)<-triples b1, (l2,e2,_)<-triples b2, l1==l2]++
[ e1 | (l1,e1,_)<-triples b1, l1 `notElem` labels b2]++
[ e2 | (l2,e2,_)<-triples b2, l2 `notElem` labels b1]
, [ m1 \cup m2 | (l1,e1,_)<-triples b1, (l2,e2,_)<-triples b2, l1==l2]++
[ m1 | (l1,_,m1)<-triples b1, l1 `notElem` labels b2]++
[ m2 | (l2,_,m2)<-triples b2, l2 `notElem` labels b1]
)
The BOX-syntax is already delimited syntactically, so the union operator |
between boxes could be added without pain to unite boxes (without being "guarded" by concepts). The empty box can (if needed) be represented by BOX []
.
Your proposal does not avoid confronting users with the differences in semantics between WCABF and FCFS. In the 'overlap' section I have given example populations that are endo and properties. If your proposal did avoid confronting users with semantical difficulties, I'd be all for it. As it stands, I don't see a benefit to such a severe restriction.
Here's an example that shows the benefits of allowing for expressions. Suppose we want to see a mailbox a-la-outlook. The goal is to show a nice picture if your mailbox is empty, but to show email otherwise. I would be able to do that using expressions, writing something like this:
mail :: Mailbox * Mail -- describes which mails are in a mailbox
INTERFACE Mailbox : I[Mailbox]
BOX ["??": I /\ mail;mail~ BOX [ "mail": mail ]
| V[Mailbox*ONE]-(mail;V) BOX [ "picture" : ONE ]
]
Now the question is, can we do this using Han's syntax?
The only thing we are allowed to write are concepts, so we have the choice between writing Mailbox
and Mail
in front of the box. Mail
would give a type error, so we can only write something that looks like:
case Mailbox -> ..
Mailbox -> ..
That's obviously not going to work. This means that I'm forced to change my data-model if I want to support this. Indeed, with a different data-model, I am able to write an equivalent script:
mail :: Mailbox * Mail -- describes which mails are in a mailbox
INTERFACE Mailbox : I[Mailbox]
BOX [case I of EmptyBox -> BOX [ "picture" : I[ONE] ]
| FullBox -> BOX [ "mail" : mail ]
]
CLASSIFY FullBox ISA Mailbox
CLASSIFY EmptyBox ISA Mailbox
RULE FullBox : I[Mailbox];mail :- V[Fullbox*Mail]
MEANING "A mailbox is full if it contains at least one email"
RULE EmptyBox : V[Mailbox*Mail] - I[Mailbox];mail :- V[Emptybox*Mail]
MEANING "A mailbox is an emptybox if it contains no mail"
RULE FullBox : I[Fullbox];I[Mailbox];I[Emptybox] :- 0
MEANING "No mailbox is full and empty"
The reason I dislike this solution is that we now have a lot of semantic-model code (with rules and classify statements) that are only supposed to deal with interfaces.
Examples like the mailbox example is why I prefer to be able to write a flexible expression rather than a fixed concept.
Oh I really like this issue and the conversation.
I definitely have a use case for the WCABF and expression approach of @sjcjoosten.
In my application, you all know as Semantic Treehouse, I try to specify as much as functionality with Ampersand. But that results in very ugly interfaces, even worse generated APIs and burden to maintain consistency. Let me give an example:
In our application we have elements (let's call them A) that can be locked or not. This is modeled like:
CONCEPT A ""
RELATION locked[A] [PROP]
Now we need a single interface where the user can navigate to, but depending wether the element is locked or not, different content and editing functionality is shown. We currently do this like so:
INTERFACE "Example" : I[A] BOX <DIV>
[ "locked" : I;locked BOX [] -- interface without editing functionality
, "unlocked" : I-locked BOX [] -- interface with editing functionality for certain fields
]
The locked and unlocked interfaces have the same fields, but for some fields editing functionality is specified (i.e. CRUDs are different). In the current situation I have to copy the whole BOX structure.
Note that the top level box is a BOX <DIV>
. The div template doesn't show any markup, labels or anything. It can be used for these non-functional boxes.
With the proposed syntax of this issue, I could simplify this, e.g.:
INTERFACE "Example" : I[A] BOX
[ "r" : r cRud
, "s" : locked;s cRud
| I-locked BOX [ "s" : cRUd ] -- a box to edit 's'
]
This way I don't have to duplicate code and specify only the editing functionality additionally.
I'm not happy with the title change, I think the way we teach students to use concepts (namely that permanent properties can be encoded as concepts with an ISA, but temporary ones - like locked/unlocked, should not) is a good way to see concepts. Michiel's example shows that union types are not about specialized concepts: he uses union types in a way to talk about what is fundamentally a transient property
Edit: I see now that I am the one who asked for this title change... I guess I unknowingly changed my mind some time between July and September.
TL;DR: Aren't we making things too difficult here? Again new syntax to remember and teach (and (mis)use)? As @Michiel-s said, we already have <DIV>
(now <RAW>
and <RAW form>
).
The original example of @sjcjoosten was:
INTERFACE EmployeeContract : I[Employee]
BOX [ "Employee" : I
, "Contract" : contract
BOX [ "Start" : startdate
, "??" : I[FixedTermContract] BOX [ "End" : enddate ]
| I[PermanentContract] BOX []
]
]
B.t.w.: this syntax won't work because BOX []
isn't allowed, but that's nit-picking...
Here is how you can currently already do this (jan 2021):
INTERFACE EmployeeContract : I[Employee]
BOX [ "Employee" : I
, "Contract" : contract BOX <RAW form>
[ "generic contract stuff": I[Contract] BOX <FORM> [ "Start" : startdate ]
, "fixed term contract stuff": I[FixedTermContract] BOX <FORM hideOnNoRecords> [ "End" : enddate ]
, "permanent contract stuff": I[PermanentContract] BOX <FORM hideOnNoRecords> [ " ": TXT "No enddate (permanent contract)" ]
]
]
@RieksJ you might be right, and WCABF is semantically eerily similar to not implementing anything special. Coming back to Michiel's example, one almost-equal thing he could have written was:
INTERFACE "Example" : I[A] BOX
[ "r" : r cRud
, "s-locked" : locked;s cRud
, "s-unlocked" : I-locked BOX [ "s" : s cRUd ] -- a box to edit 's'
]
So how about allowing the following instead:
INTERFACE "Example" : I[A] BOX
[ "r" : r cRud
, "s" : locked;s cRud
, "s" : I-locked BOX [ "s" : s cRUd ] -- a box to edit 's'
]
That means: no new syntax to teach, just existing syntax to misuse. I propose only allowing duplicate labels if they appear consecutively, and drawing them indiscriminately whenever this happens. Semantically, that gives us union types.
We don't even need the allow-duplicate-labels feature. The script below already works now:
INTERFACE "Example" : I[A] BOX
[ "r" : r cRud
, "s" : locked;s cRud
, "s " : I-locked BOX [ "s" : s cRUd ] -- a box to edit 's'
]
The 'technique' of using spaces to differentiate between strings may not be very elegant, but it helps getting stuff done, e.g. as in the following:
INTERFACE "Example" : I[A] BOX
[ " " : TXT "<hr>" --- draw a first horizontal line
, "r" : r cRud
, " " : TXT "<hr>" --- draw a second horizontal line
, "s" : locked;s cRud
, "s " : I-locked BOX [ "s" : s cRUd ] -- a box to edit 's'
, " ": TXT "<hr>" --- draw a third horizontal line
]
Yes, but now there are two labels "s" instead of one in the interface.
Wondering why we had the restriction that labels within a box MUST be unique (rather than SHOULD be unique), and if we cannot lift that restriction. After all, for showing-on-the-screen purposes there is no reason to use 'MUST', whereas BOX-templates (e.g. PROPBUTTON, or the newly added OBJECTDROPDOWN and VALUEDROPDOWN templates - you should try them) will automatically enforce whatever they need to - including additional constraints such as required or optional labels.
That's because the labels within a box serve(d?) two purposes:
For purpose 1 I can see the 'MUST' requirement. For purpose 2 the 'MUST' requirement can be a (circumventable) obstacle - see the example above with horizontal lines. So I would vote for having 'SHOULD', and not enforce 'MUST'. Given that VIEWs and INTERFACEs are going to be unified, I can imagine one serving purpose 1 (enforcing the 'MUST' requirement) and the other serving purpose 2 (not enforcing 'MUST')...
This distinction is what issue https://github.com/AmpersandTarski/Ampersand/issues/708 is about
I agree that now that we identified that we have one thing (labels) with two purposes (internal json keys, interface labels), we are in need of separating the one thing into two things.
The purpose of this issue, when I opened it, was to deal with the internal keys issue. Over the course of this issue we came up with the idea of allowing duplicates there and giving it a WCABF semantics. Your horizontal lines example is a bit weird to use in the discussion about internal keys because the " "
internal key refers to an expression whose target is "<hr>"
, which is not an atom in the declared context, but it's a great illustration of the interface labels issue.
Issue #708 suggests separating the two purposes by having a LABEL
keyword, which would allow us to assign labels to relations, concepts, rules, and interfaces, but not to internal json keys.
Here is how you can currently already do this (jan 2021):
INTERFACE EmployeeContract : I[Employee] BOX [ "Employee" : I , "Contract" : contract BOX <RAW form> [ "generic contract stuff": I[Contract] BOX <FORM> [ "Start" : startdate ] , "fixed term contract stuff": I[FixedTermContract] BOX <FORM hideOnNoRecords> [ "End" : enddate ] , "permanent contract stuff": I[PermanentContract] BOX <FORM hideOnNoRecords> [ " ": TXT "No enddate (permanent contract)" ] ] ]
It is not entirely clear to me how this would work upon a create, but I'm happy to close this issue on this basis.
Creating contracts is a different situation, as typically you do not create a contract using a simple CRud (e.g. in between contract and
BOX
That's a good explanation for this particular situation, but what behavior would you expect from Ampersand if the user selects "Create Employee using EmployeeContract interface" or the likes? (Just because creating a contract using a simple CRud might not be typical, does not mean the user won't select it, or that we should forbid programmers from allowing it)
For any syntax r[A*B] CRUD BOX ...
I would expect that the creating an atom of would make it a type B
. Since your INTERFACE has multiple such structures, it depends on where the user clicks the create button.
Personally, I do not use the C for creating atoms in INTERFACEs for users any more (in APIs that's different). That is because I want my prototypes to be close to real applications, and for such purposes, creating an atom usually comes with consequences that go way beyond what C could do for me. So now I am in the habit of using BUTTONs to request for the creation of an atom, and ExecEngine rules to model the behavior of the creation process.
On 13 Sep 2020 @sjcjoosten wrote:
I'm not happy with the title change, I think the way we teach students to use concepts (namely that permanent properties can be encoded as concepts with an ISA, but temporary ones - like locked/unlocked, should not) is a good way to see concepts. Michiel's example shows that union types are not about specialized concepts: he uses union types in a way to talk about what is fundamentally a transient property
Edit: I see now that I am the one who asked for this title change... I guess I unknowingly changed my mind some time between July and September.
No problem. Let's change it back again....
Problem
Suppose we have a generic concept and two specific concepts, e.g.
How do we build an interface for
Contract
?The general problem is how to combine interfaces for specific concepts into an interface for a generic concept. Currently we have no operators to combine interfaces, so this calls for a language extension. This issue is to agree on some syntax for that.
Proposal
For example, an employee can be on a fixed-term and on a permanent contract, and the contract interface for such an employee may have two different looks. However, the concept that varies is Contract (which can be Permanent or Fixed-term):
My proposal is to use the operator
|
for this:The semantics would be similar to having a
,
there, with two subtle differences:,
does not allow you to use the same label twice|
implies that only one of the two (or more) options can be used, never both. For displaying (read) this should mean that the first of the two is given preference.For me personally, the need for
|
came up when working on Issue #1071 as I don't have a way to specify Haskell data-types with unions in them.