AmpersandTarski / Ampersand

Build database applications faster than anyone else, and keep your data pollution free as a bonus.
http://ampersandtarski.github.io/
GNU General Public License v3.0
40 stars 8 forks source link

`CLASSIFY A IS B` should be interchangeable with `CLASSIFY A ISA B` and `CLASSIFY B ISA A` #1309

Open stefjoosten opened 2 years ago

stefjoosten commented 2 years ago

What happened

I have two scripts: Script_ISA and Script_IS:

CONTEXT Script_ISA

RELATION r[X*A]
RELATION r[X*B]

CLASSIFY A ISA B
CLASSIFY B ISA A

RULE r;r~

ENDCONTEXT

Script_ISA compiles without errors. The following script doesn't:

CONTEXT Script_IS

RELATION r[X*A]
RELATION r[X*B]

CLASSIFY A IS B

RULE r;r~

ENDCONTEXT

Script_IS produces the following error message:

  Cannot disambiguate the relation: r
    Please add a signature (e.g. [A*B]) to the relation.
    Relations you may have intended:
    r [X*A] (/var/www/data/scripts/stefj/Script_52878b5e-ca3a-4465-9baf-6f60e4354a83/ScriptVersion_4f23d0c9-a5f2-4023-892c-99f9e26338f7/script.adl:3:1)
    r [X*B] (/var/www/data/scripts/stefj/Script_52878b5e-ca3a-4465-9baf-6f60e4354a83/ScriptVersion_4f23d0c9-a5f2-4023-892c-99f9e26338f7/script.adl:4:1)
  Note: Some cases are not disambiguated fully by design. You can read about
    this at https://github.com/AmpersandTarski/Ampersand/issues/980#issuecomment-508985676
ExitFailure 10

What I expected

I expected that both scripts would compile without errors. In the wake of the discussion in issue #1306, I hesitate to call this a feature or a bug. Since $A\subseteq B\wedge B\subseteq A\Rightarrow A=B$, I would expect that Script_ISA entails Script_IS. With this result, I'm not sure anymore about issue #1306... How do I explain this to students?

Version of Ampersand that was used

I did this on RAP, using Ampersand-v4.3.0 [32141f41cf2af837d59eb57bc73320f4ef48fa00:refs/heads/development], build time: 15-Aug-21 07:39:12 UTC

sjcjoosten commented 2 years ago

This is a bug, the first of the two scripts should give a disambiguation error. Will investigate.

stefjoosten commented 2 years ago

One more thing, @sjcjoosten. I figured out your challenge how to model that two distinct concepts must always have the same population:

CONTEXT Distinct
CLASSIFY A ISA G
CLASSIFY B ISA G
ENFORCE  I[A] := I[B]
ENDCONTEXT

How do I model the two concepts A and B to become aliases? Answer:

CONTEXT Alias
CLASSIFY A ISA G
CLASSIFY B ISA G
CLASSIFY G ISA A
CLASSIFY G ISA B
ENDCONTEXT

or shorter:

CONTEXT Alias
CLASSIFY A ISA B
CLASSIFY B ISA A
ENDCONTEXT

Note that the ENFORCE I[A] := I[B] would be redundant in CONTEXT Alias because it follows from the fact that the populations of A and B are equal at all times.

Disambiguation

In the former script (CONTEXT Distinct) I would expect a disambiguation error in the following:

CONTEXT Distinct
CLASSIFY A ISA G
CLASSIFY B ISA G
ENFORCE  I[A] := I[B]

RELATION r[X*A]
RELATION r[X*B]
RULE r;r~

ENDCONTEXT

In the latter script (CONTEXT Alias) I would expect no compiler errors in the following:

CONTEXT Alias
CLASSIFY A ISA B
CLASSIFY B ISA A

RELATION r[X*A]
RELATION r[X*B]
RULE r;r~

ENDCONTEXT

All of the above is exactly as Ampersand behaves now. It is just that the following yields a disambiguation error:

CONTEXT Alias
CLASSIFY A IS B

RELATION r[X*A]
RELATION r[X*B]
RULE r;r~

ENDCONTEXT
stefjoosten commented 2 years ago

@sjcjoosten Good discussion yesterday. We agreed on the following:

  1. This issue a bug. CLASSIFY A IS B should be interchangeable with CLASSIFY A ISA B and CLASSIFY B ISA A, in all situations (i.e. in situations where there are type errors, population errors, or no errors). I have renamed this issue accordingly.
  2. Script_ISA and Script_IS should either (a) both be correct or (b) both be rejected by the compiler because RULE r;r~ is ambiguously typed.
  3. There is no mathematical motivation to prefer (a) or (b). The motivation should therefore be methodological.
  4. All scripts above share that the population of concepts A and B must be equal.
  5. The discussion seems to narrow down to cycles in the concept graph.

So now I'm looking for the right methodological motivation for one or the other. Here are some arguments; one pro and one con:

  1. Throughout Ampersand we have adopted the principle: "If there is no formal reason to forbid it, allow it." Since a cycle in the concept graph has a sound interpretation (as aliases), let us apply this principle and allow it.
  2. An aliasing mechanism is methodologically totally different, so we don't want to pollute the type system with an implicitly defined aliasing mechanism. So let us forbid it. (I think this implies that we should forbid cycles in the concept graph altogether...).

@sjcjoosten, please add your other arguments. When our discussion stopped, yesterday, you had just dropped an example in the chat. However, I could not recover that chat, so maybe you can drop the example once again in this discussion thread.

@sjcjoosten I am also curious to know what our scientific papers Relational Heterogeneity Relaxed by Subtyping (DOI 10.1007/978-3-642-21070-9_25) and Type Checking by Domain Analysis in Ampersand (DOI 10.1007/978-3-319-24704-5_14) have to say about this matter... Maybe Jaap van der Woude has something to say about this?

stefjoosten commented 2 years ago

I think this relates to issue #1281.