AmpersandTarski / Ampersand

Build database applications faster than anyone else, and keep your data pollution free as a bonus.
http://ampersandtarski.github.io/
GNU General Public License v3.0
40 stars 8 forks source link

feature: EQUALS statement #851

Open stefjoosten opened 5 years ago

stefjoosten commented 5 years ago

Issue #850 introduces a new statement: the EQUALS statement. This equates two named objects, so it is in fact a specific form of rule.

To be more precise, let me propose the following as the operational semantics of the EQUALS statement. I am expressing it in the context of FormalAmpersand (i.e. Ampersand's own metamodel).

RELATION equals[Relation*Relation]
RULE "Equate relations" : equals |- I
VIOLATION (TXT "{EX} MrgAtoms;Relation;", SRC I, TXT ";Relation;", TGT I )

Any violation that is triggered by this rule should prevent the further interpretation (such as code generation) of the namespace in which the violation has occured.

As a consequence, two relations that are made equal may have multiple signatures (i.e. name-source-target). A pair that exists in either relation now lives in the resulting relation, yielding the union of all pairs from both sides.

(We will not implement this until issue #850 is well on its way. So we intend to implement the EQUALS statement as a feature branch within feature 'Multi Context'.)

sjcjoosten commented 5 years ago

There are several things to think about:

For concepts, EQUALS is already in place:

CLASSIFY X IS Y

what happens here, is that X or Y will be chosen as the canonical concept name, and writing either becomes completely equivalent. As a data structure for concepts, the entire set of equivalent concepts is typically used in the type system itself.

I have not seen a syntax proposal yet; here is one that would be problematic:

EQUAL r IS s

The problem here is that if r and s are both relations, interfaces, rules and concepts, we have an ambiguous statement: which do we mean to be the same? So lets suppose that we have some way of knowing the kind of r and s. I'll add some syntax to the examples from this point on. Here kind is context, relation, concept, rule, service and maybe something else (per number 7 in issue #850 .

Another question is how to merge two things. In the context of Hawaii:

RULE guardPrerequisites1 : required;attends~ |- pass
VIOLATION (TXT "Course ", SRC I, TXT " must be passed by Student ", TGT I)

RULE guardPrerequisites2 : attends~ |- required\pass
VIOLATION (TXT "Student ", TGT I, TXT " cannot go to ", SRC I, TXT " without passing ", SRC required~)

EQUAL Rule guardPrerequisites1 IS Rule guardPrerequisites2

I can imagine that merging these rules would not be allowed because they have different types. The point here is that for every kind we need a good definition of what the merge operation does.

For relations it seems clear:

EQUALS Relation r IS Relation s

I will want to check that the types of r and s are indeed equal. This brings us to the checks that we may want to perform before we merge. For instance, some things of equal type cannot be declared equal:

EQUALS Concept BLOB IS Concept Integer

the problem here is that BLOB and Integer are built-in concepts, and they are to be mapped to different things in SQL. Similarly, some atoms can be declared equal, but the Integer 3 and the integer 1 cannot.

After working out what needs to be checked, and what merge operations we want, the main complication will be giving good feedback. Consider writing:

RELATION r[ONE*A]
RELATION s[ONE*B]
RELATION t[ONE*C]
EQUALS Concept A IS Concept C
EQUALS Concept B IS Concept C
EQUALS Relation r IS Relation s
EQUALS Relation r IS Relation t
EQUALS Rule x IS Rule y
Rule x : r = s~
Rule y : t = t~

A bad error message could read:

Cannot match C to ONE in Rule y : t = t~

Similarly, in the generated SQL, interfaces and documents, we want to use the preferred names over the canonical names where possible. To achieve this, the Haskell data-structure for a reference to a named object should carry:

Note that it might be possible for the preferred names to be equal, while the references are not:

RELATION r[A*B]
RELATION r[B*A]
RELATION s[B*A]
EQUALS Relation r IS Relation s
RULE x: r[A*B];r~ = r~;r

The preferred names in the rule above are r[A*B], r, r and r, respectively. The fully qualified sets are {r[A*B]}, {r[A*B]}, {r[B*A],s[B*A]} and {r[B*A],s[B*A]}, respectively (we may use a rich data-structures that includes the line of the declaration etc). As we allow for the combination of different scripts, these possibilities grow.

A final thing to think about is whether namespaces should be checkable in an incremental way: do we wish to allow a script that has a rule with y in it, while y is not in scope: that is, we allow for a script that cannot be checked individually, and can only be used inside a bigger context. There are a couple of advantages of doing this:

stefjoosten commented 5 years ago

What we discussed is to use the EQUALS statement only for concepts and relations. We could not come up with a proper use case for equating rules, yet a cartload of problems emerge if you do. I should have documented this restriction earlier.

If we decide to use CLASSIFY for equating concepts, we can restrict the use of EQUALS to relations. That way we don't have to complicate the syntax with the kind RELATION.

stefjoosten commented 5 years ago

Are there any syntactical problems for letting the following two code fragments be equivalent?

CLASSIFY A IS B
CLASSIFY D IS E
CLASSIFY P IS Q

and

CLASSIFY A IS B, D IS E, P IS Q

?

stefjoosten commented 5 years ago

Now suppose we have the following code fragment in a namespace:

RELATION r[A*B]
RELATION s[P*Q]
EQUALS r=s

The rule "Equate relations" invites to allow s[A*B], or r[P*B]. So the type checker will enforce (by means of type errors) that A=P and B=Q.

stefjoosten commented 5 years ago

Ampersand 3 (the current situation) has the following metamodel:

RELATION name[Relation*RelationName] [UNI,TOT,SUR] -- I must look up why this is surjective
RELATION sign[Relation*Signature] [UNI,TOT] 
RELATION src[Signature*Concept] [UNI,TOT]
RELATION tgt[Signature*Concept] [UNI,TOT]

I would expect the following in the syntactical structure:

RELATION signature[Relation*Signature] [TOT]
RELATION name[Signature*Identifier] [UNI,TOT]
RELATION type[Signature*Type] [UNI,TOT]
RELATION src[Signature*Concept] [UNI,TOT]
RELATION tgt[Signature*Concept] [UNI,TOT]

Somehow the latter feels better. However, I cannot find a compelling reason yet why we would have to change the metamodel... (Anyone?)

hanjoosten commented 5 years ago

Are there any syntactical problems for letting the following two code fragments be equivalent?

CLASSIFY A IS B
CLASSIFY D IS E
CLASSIFY P IS Q

and

CLASSIFY A IS B, D IS E, P IS Q

?

Here are my pro's and cons: Pros:

Cons:

RieksJ commented 5 years ago

On 23 Nov 2018, @sjcjoosten wrote

For concepts, EQUALS is already in place:

CLASSIFY X IS Y

what happens here, is that X or Y will be chosen as the canonical concept name, and writing either becomes completely equivalent. As a data structure for concepts, the entire set of equivalent concepts is typically used in the type system itself.

In order to properly integrate the new prototype framework and SIAM, we need real equivalence between the concepts PF_Role (from the new prototype framework) and Role (as defined and used extensively in SIAM and applications that use SIAM). According to what @sjcjoosten has written above, statement CLASSIFY Role IS PF_Role would make PF_Role and Role <QUOTE>completely equivalent</QUOTE>.

However, this is not the case. When inspecting the database we see PF_Roles that are not Roles, with corresponding failing behaviour in prototype functionality.

@sjcjoosten: can you fix this?

RieksJ commented 5 years ago

Related to the above comment is that when the SIAM module contains CLASSIFY Role IS PF_Role and when the SystemContext.adl file uses CLASSIFY PF_Role IS Role, then an error is generated saying there is a cycle. This is obviously not the case because both concepts are EQUALS.

RieksJ commented 4 years ago

Regarding syntax for EQUALS, I suggest to do this in a similar way as PURPOSE. Stating the equivalence of two relations r and s would then be phrased as

EQUALS RELATION r IS s

from which it is obvious that both r and s are relations.

However, a nice (and useful) extension would then be to allow s to be an expression, which would mean that the following syntax would be valid:

EQUALS RELATION r IS expression

Implementation is rather straightforward, as this (in case r has signature A * B) would be syntactical sugar for:

ROLE ExecEngine MAINTAINS "EQUALS - InsPair r[A*B] IS expression"
RULE "EQUALS - InsPair r[A*B] IS expression": expression |- r[A*B]
VIOLATION (TXT "{EX} InsPair;r;A;", SRC I[A], TXT ";B;", TGT I[B])
ROLE ExecEngine MAINTAINS "EQUALS - DelPair r[A*B] IS expression"
RULE "EQUALS - DelPair r[A*B] IS expression": r[A*B] |- expression
VIOLATION (TXT "{EX} DelPair;r;A;", SRC I[A], TXT ";B;", TGT I[B])

and the associated code would then need to be generated.