Identifiers and labels/names of concepts, relations and other artifacts

stefjoosten commented 7 years ago

Here is a funny one.

I had a diacritical mark in the name of a relation. The compiler swallows it, but the database doesn't.

When installing the database, it said that the relation session was missing....

So don't use diacritics in your identifiers....

Michiel-s commented 6 years ago

Hmm. I think we should look into what’s happening here. It might be the case that the special character is not allowed in an SQL table or column name, so that installing the database fails when the application runs for the first time. But maybe it is in the encoding of special characters used in the front end. Could you specify the character you used? Then I can try to reproduce this issue and look into it.

RieksJ commented 6 years ago

Problems with 'weird' characters keep popping up... I suggest that names of relations and concepts should be forced to use characters from a limited character set that excludes any character that might be 'special' in some case. The argument is that names of relations and concepts are there for the Ampersand engineer, and hence can well do without such characters.

Michiel-s commented 6 years ago

I agree. Many other languages support the distinction between the artefact identifier and the human readable name/label. e.g.

In RDFS/OWL you have the URI of a class (ampersand: concepts) or property (ampersand: relations) and the rdfs:label to specify the human readable label
ORM provides functionality for complex verbalization of Objects (ampersand: concepts) and Roles (ampersand: relations).

In ampersand we have the possibility to specify a "definition" or "meaning" for concepts, relations, rules and other artefacts. For the prototypes, the generator already creates two separate things, namely: an id (using the ANSI# method) and a name (using the concept name as specified in a script). Maybe it is time to restrict the names in ampersand scripts and add a consistent set of meta-data, including a human readable name/label. A suggestion for the Ampersand language:

CONCEPT "AConceptName"
LABEL "A concept name"
MEANING "Demo concept...."
PURPOSE "....."

@stefjoosten Could you specify the character you used? Then I can try to reproduce this issue and see if my assumption about the SQL table/column names is correct.

hanjoosten commented 6 years ago

I like the proposal of @Michiel-s. It doesn't seem too much effort to implement this. However, we should first think of a proper way to write this into the Ampersand syntax. @stefjoosten, any suggestions?

RieksJ commented 6 years ago

Currently, there is a transformation going on between names used by &-developers and texts that end up in JSON files (for prototypes). This transformation is not always made at places where that is needed. An example is the ExecEngine function NavToOnCommit, which needs the id of the INTERFACE that the prototype should navigate to rather than the name of the INTERFACE as the developer has specified.

Since my guess is that this is not going to be fixed very soon, the algorithm that converts the names of Ampersand artifacts into JSON ids must be very simple and very consistent. Currently, that is not the case: some non-alphanumeric characters are replaced with _nn_, where nn is the ASCII code of the character. However, other non-alphanumeric characters are replaced in a different way. For example, _ is replaced by __ (two underscores). This was mentioned in #816 as a separate issue, but redirected here.

Not solving the issue means that &-developers may find themselves running into errors that take a long time to diagnose, or solve.

I suggest that conversion between the texts that &-developers type and the ids used by SQL is done as follows: any character that is NOT in the set _[A-Z][a-z][0-9] is converted to _nn_, where nn is the (decimal) ASCII representation of that character; other characters remain as they are.

This would mean that, in JSON:

INTERFACE "[DEBUG]" becomes "id": "_91_DEBUG_93_"
INTERFACE "_[DEBUG]" becomes "id": "__91_DEBUG_93_"
INTERFACE "_DEBUG_" becomes "id": "_DEBUG_"
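For readers who want to try the proposed rule, here is a minimal Python sketch of it. The function name `encode_name` is hypothetical and not part of Ampersand; the sketch only mirrors the rule as worded in this comment, and it uses the decimal Unicode code point, which coincides with the ASCII code for ASCII characters.

```python
import re

# Characters outside [A-Za-z0-9_] get encoded; this set comes from the proposal.
_UNSAFE = re.compile(r"[^A-Za-z0-9_]")


def encode_name(name: str) -> str:
    """Hypothetical helper: replace each unsafe character with _nn_.

    nn is the decimal code point of the character (equal to the ASCII
    code for ASCII input). Underscores, letters and digits are kept.
    """
    return _UNSAFE.sub(lambda m: f"_{ord(m.group())}_", name)


for name in ["[DEBUG]", "_[DEBUG]", "_DEBUG_"]:
    print(f'INTERFACE "{name}" -> "id": "{encode_name(name)}"')
```

The three names are the examples from this comment, so the printed ids can be compared against them directly.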

hanjoosten commented 6 years ago

As you suggested earlier (See above):

Problems with 'weird' characters keep popping up... I suggest that names of relations and concepts should be forced to use characters from a limited character set, that exclude any character that might be 'special' in some case. The argument is that names of relations and concepts are there for the Ampersand engineer, and hence can well do without such characters.

The purpose of this issue is to get rid of all those edge cases. This means that "[DEBUG]" is very likely to be disallowed, as well as your other examples.

Michiel-s commented 6 years ago

See my comment in #816 about why the underscore is escaped with an underscore instead of converted into an ascii code.
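The reasoning in #816 is not reproduced in this thread. As an illustration only, the Python sketch below shows one property that distinguishes the two schemes discussed here: if underscores are kept as-is, two different names can end up with the same id, whereas doubling the underscore keeps these two apart. Both function names are hypothetical, and neither is claimed to match the actual Ampersand implementation.

```python
import re


def encode_plain(name: str) -> str:
    # Scheme proposed in this thread: underscores are left untouched,
    # any other character outside [A-Za-z0-9] becomes _nn_.
    return re.sub(r"[^A-Za-z0-9_]", lambda m: f"_{ord(m.group())}_", name)


def encode_escaped(name: str) -> str:
    # Variant: a literal underscore is doubled, any other character
    # outside [A-Za-z0-9] becomes _nn_.
    def repl(m: re.Match) -> str:
        ch = m.group()
        return "__" if ch == "_" else f"_{ord(ch)}_"

    return re.sub(r"[^A-Za-z0-9]", repl, name)


# The names "[" and "_91_" collide when underscores are kept as-is ...
print(encode_plain("["), encode_plain("_91_"))
# ... and stay distinct when the underscore is doubled.
print(encode_escaped("["), encode_escaped("_91_"))
```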

If it would be easier for you to be able to use the interfaces names as is (instead of the encoded ones) in e.g. exec engine functions, we can make it so!

Michiel-s commented 6 years ago

Action for myself:

[ ] make the exec engine functions ‘nav to on commit’ and ‘nav to on rollback’ to use the unencoded interface names.

LloydRutledge commented 5 years ago

An OU Rule-based Design student had a '_' in one name in the code and got this error when clicking the link for the generated PDF:

ampersand: PandocPDFError "! Missing $ inserted.\n \n $\nl.122 \n"

Remove that one _ and everything's fine. I gather that's part of this larger problem.

The Rule-based Design students as a user group can easily stumble on this and not know what to do with the message. I've instructed the students to play it safe by only using alphanumeric strings for names, and at least not to use _ in names. If we can't fix this in the back-end code processing, can we solidly prevent it at the front end by having the parser be stricter with names?
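Until the parser enforces something like this, a check along the following lines could be run over the names in a script. This is a sketch in Python under an assumed rule (an ASCII letter followed by ASCII letters or digits); the rule and the name `is_safe_name` are illustrative and are not what the Ampersand parser does.

```python
import re

# Assumed rule, for illustration: one ASCII letter, then ASCII letters or digits.
SAFE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_safe_name(name: str) -> bool:
    """Return True if the whole name matches the assumed safe pattern."""
    return SAFE_NAME.fullmatch(name) is not None


for name in ["session", "aa_p", "aá_p", "[DEBUG]"]:
    verdict = "ok" if is_safe_name(name) else "rejected"
    print(f"{name}: {verdict}")
```

With this rule, underscores, diacritics and brackets are all rejected, which matches the "play it safe" advice given to the students.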

stefjoosten commented 3 years ago

This issue has been resolved in Ampersand version 4.3.0. I'm not sure what we did to resolve it, but the behavior of the compiler is correct at the moment.

To prove this, I ran the following script on RAP:

CONTEXT Issue708

RELATION aá_p [A*B] = [("1","foo")]
RELATION aa_p [A*B]
RELATION aà_p [A*B] = [("1","bar")]

ENDCONTEXT

In the database I see three different relations, which is what I would expect:

Then I ran the following test to prove that these relations are all different:

CONTEXT Issue708

RELATION aá_p [A*B] = [("1","foo")]
RELATION aa_p [A*B]
RELATION aà_p [A*B] = [("1","bar")]

RULE one :   aá_p=aa_p
RULE two :   aa_p=aà_p
RULE three : aá_p=aà_p

ENDCONTEXT

I got the following behavior

For me, this proves that this issue has been resolved.

stefjoosten commented 3 years ago

I will add this script as a test script, Issue708.adl, to the regression tests to ensure that it will fail.

hanjoosten commented 3 years ago

See this comment on why I reopened this issue.

AmpersandTarski / Ampersand

Identifiers and labels/names of concepts, relations and other artifacts #708