Allow cyclic dependencies in Concepts as synonym

AmpersandTarski / Ampersand

Build database applications faster than anyone else, and keep your data pollution free as a bonus.

http://ampersandtarski.github.io/

GNU General Public License v3.0

Allow cyclic dependencies in Concepts as synonym #999

Closed stefjoosten closed 4 years ago

stefjoosten commented 5 years ago

Problem

Ampersand-v3.17.2 and earlier prohibit cyclic dependencies in Concepts. They also prohibit multiple root concepts in a typology. However, as a user, I might have reasons to allow synonyms, for instance when combining different contexts (from different offspring) into a new context.

Solution intent

We could simply allow cyclic dependencies. All concepts in a cycle represent the same set of atoms, so effectively this means they will be synonyms of each other.
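As a rough illustration of the idea (not the Ampersand implementation), concepts that lie on a common ISA cycle can be found by computing strongly connected components. The sketch below assumes concepts are plain strings and that a classification is a (specific, generic) pair; the names Concept, Isa and synonymGroups are hypothetical.

```haskell
import Data.Graph (SCC (..), stronglyConnComp)
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for a concept; not Ampersand's real type.
type Concept = String

-- A classification statement, read as "specific ISA generic".
type Isa = (Concept, Concept)

-- Group concepts into sets of synonyms. Concepts on a common ISA cycle
-- end up in the same group; every other concept forms a group of its own.
synonymGroups :: [Isa] -> [[Concept]]
synonymGroups isas = map members (stronglyConnComp nodes)
  where
    -- Adjacency map: each concept maps to the concepts it is a kind of.
    adjacency =
      Map.fromListWith (++)
        ([(s, [g]) | (s, g) <- isas] ++ [(g, []) | (_, g) <- isas])
    nodes = [(c, c, gs) | (c, gs) <- Map.toList adjacency]
    members (AcyclicSCC c) = [c]
    members (CyclicSCC cs) = cs
```

For the statements A ISA B, B ISA A and B ISA C, this yields one group holding A and B, and a separate group holding C. The order of the groups and of the members within a group is not specified.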

In the same effort, we can allow multiple roots (see issue #898). Note that this differs from allowing cyclic dependencies. It fits well in the philosophy of allowing things that have a reasonable interpretation. Nevertheless, a warning is in order, because sometimes people define multiple roots by mistake. The way to get rid of the warning is to define a concept for the union of the roots, as described in issue #898.

stefjoosten commented 5 years ago

Impact:

- CtxError.hs, line 153 (mkCyclesInGensError): turn function mkCyclesInGensError into a warning.
- CtxError.hs, line 166 (mkMultipleRootsError): turn function mkMultipleRootsError into a warning.
- P2A_Converters.hs, line 377: instead of calling mkCyclesInGensError, function mkTypology yields one concept with multiple names.
- P2A_Converters.hs, line 391: instead of calling mkMultipleRootsError, function mkTypology produces one new concept that is the union of its constituents, together with the ISAs between each constituent and the newly generated concept.
- Allow concepts with multiple names. This affects the definition of A_Concept (AbstractSyntaxTree.hs line 754), which must accommodate multiple names. I would expect that P_Context need not have multiple names because that is a concept at parse-time. The P2A converter must collect multiple P_Concepts into one A_Concept.
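The multiple-roots item could look roughly like the sketch below. It is not taken from mkTypology: it assumes the input is the list of ISA pairs of a single typology, uses plain strings for concepts, and invents a name for the union concept by joining the root names with "_or_". All names in it (roots, addUnionRoot) are hypothetical.

```haskell
import Data.List (intercalate, nub, sort)

-- Hypothetical stand-ins; not Ampersand's real types.
type Concept = String
type Isa = (Concept, Concept) -- (specific, generic)

-- Roots of a typology: concepts that never occur as the specific of an ISA.
roots :: [Isa] -> [Concept]
roots isas = sort (nub [c | c <- concepts, c `notElem` specifics])
  where
    specifics = map fst isas
    concepts = concatMap (\(s, g) -> [s, g]) isas

-- If there is more than one root, add a synthetic union concept above
-- all roots, with one ISA from each root to that union concept.
addUnionRoot :: [Isa] -> [Isa]
addUnionRoot isas = case roots isas of
  rs@(_ : _ : _) -> isas ++ [(r, unionName rs) | r <- rs]
  _ -> isas
  where
    -- Assumed naming scheme, for illustration only.
    unionName = intercalate "_or_"
```

After addUnionRoot, the typology has exactly one root, which is what the union concept in issue #898 is meant to achieve.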

hanjoosten commented 5 years ago

So what to do with

aConcept2pConcept :: A_Concept -> P_Concept?
instance Named A_Concept ?
instance Eq A_Concept ?
instance Ord A_Concept ?

stefjoosten commented 5 years ago

Hi Han, good questions!

To start with, what to do with instance Named A_Concept? As there may be multiple names, the function name must pick one. For this purpose, the A_Concept must not only store the different names, but also the contexts that define those names. When choosing which name to print, I suggest that names from the "current context" take preference. For now, I just pick the head from a NonEmpty P_Concept.

For the instances Eq A_Concept and Ord A_Concept we leave the code as-is because it hinges on the hash attribute. We do not rely on the concept's name for equality, so there is no impact here:

instance Ord A_Concept where
  compare (PlainConcept{cpthash=v1}) (PlainConcept{cpthash=v2}) = compare v1 v2
  compare ONE ONE = EQ
  compare ONE (PlainConcept{}) = LT
  compare (PlainConcept{}) ONE = GT

instance Eq A_Concept where
  (==) a b = compare a b == EQ

Then what about aConcept2pConcept :: A_Concept -> P_Concept? It will become

aConcept2pConcept :: A_Concept -> NEL.NonEmpty P_Concept

So it yields a nonempty list instead of a single P_Concept.

Function pCpt2aCpt :: P_Concept -> A_Concept gets a different setup, because a number of P_Concepts from different contexts can be assembled into one A_Concept after a cycle of concepts is detected.
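Put together, the changes discussed in this comment might be sketched as follows. The types are heavily simplified stand-ins for the real ones in AbstractSyntaxTree.hs (positional fields, String names, an Int hash), and conceptName is a hypothetical replacement for the name function of the Named class.

```haskell
import qualified Data.List.NonEmpty as NEL

-- Simplified stand-ins for Ampersand's types; the real ones carry more fields.
data P_Concept = PCpt String | P_ONE
  deriving (Eq, Show)

data A_Concept
  = PlainConcept Int (NEL.NonEmpty P_Concept) -- hash, then all names of the concept
  | ONE

-- Equality and ordering look at the hash only, never at the names.
instance Eq A_Concept where
  a == b = compare a b == EQ

instance Ord A_Concept where
  compare (PlainConcept h1 _) (PlainConcept h2 _) = compare h1 h2
  compare ONE ONE = EQ
  compare ONE (PlainConcept _ _) = LT
  compare (PlainConcept _ _) ONE = GT

-- One A_Concept maps back to all P_Concepts it was assembled from.
aConcept2pConcept :: A_Concept -> NEL.NonEmpty P_Concept
aConcept2pConcept (PlainConcept _ ps) = ps
aConcept2pConcept ONE = P_ONE NEL.:| []

-- For now, the name of a concept is simply the first of its names.
conceptName :: A_Concept -> String
conceptName cpt = case NEL.head (aConcept2pConcept cpt) of
  PCpt nm -> nm
  P_ONE -> "ONE"
```

Because the comparison ignores the list of names, two values with the same hash are equal regardless of which aliases they carry.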

Michiel-s commented 5 years ago

Makes sense. I recognize this from the OWL ontology language.

“NOTE: If we wanted to "upgrade" an axiom of the form "A subClassOf B" to "A equivalentClass B" (meaning that the class extension of A is not just any subset, but in fact the same set as the class extension of B), we could add a second subClassOf axiom of the form (B subClassOf A), which by definition makes the two class extensions equivalent (and thus has the same meaning as "A equivalentClass B"). Such subClassOf "cycles" are explicitly allowed. As OWL is usable in a distributed environment, this can be a useful feature.” Source: https://www.w3.org/TR/owl-ref/#subClassOf-def

hanjoosten commented 4 years ago

I have implemented a nice solution to this problem, and created pull request https://github.com/AmpersandTarski/Ampersand/pull/1055 for it. The idea is that in the P-structure we harvest all concepts that are equivalent, based on cycles in the CLASSIFY statements. All such concepts will be considered as a single concept in the A-structure. All relations that have any of these concepts as source or target will have that single concept as source/target in the A-structure.
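The last step, giving every relation the single merged concept as its source/target, can be illustrated with a small sketch. It is not the code from the pull request: Relation is a hypothetical, simplified record, concepts are plain strings, and the first member of each group of equivalent concepts is arbitrarily used as the representative.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for a concept; not Ampersand's real type.
type Concept = String

-- Hypothetical, simplified relation: a name plus source and target concept.
data Relation = Relation
  { relName :: String
  , relSource :: Concept
  , relTarget :: Concept
  }
  deriving (Eq, Show)

-- Map every concept in a group of equivalent concepts to one representative.
-- Here the first member of each group is used; empty groups are skipped.
representatives :: [[Concept]] -> Map.Map Concept Concept
representatives groups =
  Map.fromList [(c, rep) | grp@(rep : _) <- groups, c <- grp]

-- Replace source and target of a relation by their representatives.
-- Concepts that are not in any group are left unchanged.
canonicalise :: Map.Map Concept Concept -> Relation -> Relation
canonicalise reps rel =
  rel {relSource = look (relSource rel), relTarget = look (relTarget rel)}
  where
    look c = Map.findWithDefault c c reps
```

With the group Person, Human, a relation declared on Human ends up with Person as its source, so both spellings refer to the same relation signature.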

Some choices I made in this implementation:

- There is no additional syntax created. This is on purpose. I expect that we need extra syntax when we implement namespaces. In designing a specification language, it is easy to create extra syntax, but it is close to impossible to get rid of it later on.
- There are no warnings generated. I agree with @RieksJ that this isn't required.
- ONE cannot have an alias
- SESSION can have aliases, but rules about SESSION remain in effect.
- There is no way to specify what the preferred name of the remaining concept is. I chose to take the smallest name (sorted alphabetically).
- I adapted the show function of A_Concept. It now shows not only the name, but all aliases as well. This could affect error messages.
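The last two choices can be pictured with a small sketch. It assumes the names of a merged concept are held as a nonempty list of strings; the functions preferredName and showWithAliases and the "(aka ...)" output format are made up for illustration and are not what the pull request prints.

```haskell
import Data.List (intercalate)
import qualified Data.List.NonEmpty as NEL

-- Preferred name: the smallest one in the default String ordering.
-- Note that this ordering compares code points, so uppercase letters
-- sort before lowercase ones.
preferredName :: NEL.NonEmpty String -> String
preferredName = minimum

-- Show the preferred name followed by the remaining aliases, if any.
-- The output format is an assumption made for this sketch.
showWithAliases :: NEL.NonEmpty String -> String
showWithAliases names =
  case filter (/= preferred) (NEL.toList names) of
    [] -> preferred
    others -> preferred ++ " (aka " ++ intercalate ", " others ++ ")"
  where
    preferred = preferredName names
```

A deterministic choice such as the smallest name keeps generated output stable between runs, even though it says nothing about which alias the user would prefer.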

stefjoosten commented 4 years ago

Great! I like this solution because it is upwards compatible and it maintains the rule in the metamodel that (A ISA B) /\ (B ISA A) <=> (A=B). (Currently, this rule is true but empty)

We will get some details to look after, which may pop up as issues in the future:

[ ] The choice of a concept name, e.g. when generating Ampersand-source code and when assembling error messages. The preferred name should be as close to the user as possible, so an alias in the user context (namespace) is preferred over an alias from a more generic namespace. (This is not relevant for the compiler, where any alias can be used to point to the concept.)
[ ] No warnings for cycles is fine, because every cycle is now interpretable within the language as one concept, which has as many aliases as there are P_Concepts in the cycle.
[ ] In the drawings that are generated we want to see the aliases that are declared in the namespace(s) that are drawn.
[ ] I'm not sure about no aliases for ONE. I see no reason why the user could not rename it. Whether that is wise or not is for the user to judge. That would mean we might lift that restriction in the future without many consequences.
[ ] I'm sure that sooner or later someone (Rieks? Myself?) will ask to be able to specify the preferred concept name within a namespace. However, this is semantically neutral so I'm not sure whether it is a good idea to make one alias "more equal than others".
[ ] About issue #1004: I am not sure whether this feature will increase the number of columns in wide tables, even though Rieks' reasoning in this issue is plausible. Nevertheless, we should not let implementation details like #columns in a Maria database influence the language. We must solve the issue, simply because our current implementation is partial in this respect.
[ ] About issue #898: We could resolve this simultaneously. This would cause ANY set of classification statements in an Ampersand-script to be correct. It will also cause that all concepts within a typology have the same root. It will also ensure that the typologies form a partitioning of all concepts. Plenty of reasons to take this on...
[ ] The PHP-code in an ExecEngine rule mentions concepts. I think we will need some translation to the correct table name. We need to check whether this has any consequences. Are there automated tests that catch this?

hanjoosten commented 4 years ago

Aliasing of concept names now works and is merged into development.