Jean-Luc-Picard-2021 closed 2 weeks ago
Ok, interesting!
Scryer Prolog allows OTHER_NUMBER in Prolog identifiers:
$ target/release/scryer-prolog -v
v0.9.4-55-gd6ac0355
$ target/release/scryer-prolog
?- X = π(x).
X = π(x).
?- X = π₂(x).
X = π₂(x).
SWI-Prolog doesn't, though that is not really a reason.
The actual reason is that Trealla uses the C function iswalnum (mainly) as in:
while (iswalnum(ch)
#ifdef __APPLE__
|| iswideogram(ch)
#endif
|| (ch == '_')) {
and apparently C doesn't include OTHER_NUMBER in there.
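To make the two policies concrete, here is a rough Python sketch. It is not Trealla's or Scryer's actual code: `is_ident_char`, `is_ident` and the `allow_other_number` flag are made-up names, and treating "letters plus decimal digits" as the alnum-style set is an assumption for illustration only.

```python
import unicodedata

def is_ident_char(ch, allow_other_number=False):
    """Hypothetical identifier-character test based on Unicode general categories."""
    if ch == "_":
        return True
    cat = unicodedata.category(ch)
    # Letters (L*) and decimal digits (Nd): assumed stand-in for an alnum-style test.
    if cat.startswith("L") or cat == "Nd":
        return True
    # OTHER_NUMBER is the general category "No" (subscripts, fractions, ...).
    return allow_other_number and cat == "No"

def is_ident(text, allow_other_number=False):
    """True if every character of a non-empty string passes is_ident_char."""
    return bool(text) and all(is_ident_char(c, allow_other_number) for c in text)
```

With the flag off, `is_ident("π₂")` is rejected because of the subscript; with `allow_other_number=True` it is accepted. `is_ident("a'")` is rejected either way, since the apostrophe is punctuation.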
The more I look into it, the less reason I see to include it. In maths and CS it is common to give identifiers designations like a' (e.g. a-prime), which you can't do in Prolog either. I think it is a mistake for Scryer to allow it.
Interesting, SWI-Prolog has a code_type/2 predicate that can also work with mode (+, -):
?- char_code('𝜆', X), code_type(X, Y).
X = 120582,
Y = csym ;
Etc..
How would I do that in Trealla Prolog? What I have figured out so far is that there is a predicate for mode (+, +):
?- char_code('𝜆', X), '$code_type'(X, lower).
X = 120582.
But there is no agreement on how things are classified: SWI-Prolog thinks it's "csym", whatever that means, and Trealla Prolog classifies it as "lower", which is
closer to the Unicode categories. I think relying completely on Unicode categories would have the advantage of
being consistent among Prolog systems. But it could be that there is no easy mapping to some common C libraries.
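For comparison, the raw Unicode general category of the characters in this thread can be looked up directly. The sketch below uses Python's standard `unicodedata` module; `general_category` is just a helper name chosen here, not something from any Prolog system.

```python
import unicodedata

def general_category(code):
    """Return the two-letter Unicode general category for a code point."""
    return unicodedata.category(chr(code))

# Code points from the thread: π, ₂, 𝜆 (120582) and ⒙.
for code in (0x03C0, 0x2082, 120582, 0x2499):
    ch = chr(code)
    print(f"U+{code:04X} {unicodedata.name(ch)}: {general_category(code)}")
```

Here π and 𝜆 come out as `Ll` (lowercase letter), while ₂ and ⒙ both come out as `No`, which is OTHER_NUMBER.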
while (iswalnum(ch)
I think this is not required. The problem is that OTHER_NUMBER also has some members with fractional number values, or number values that are greater than a single digit.
For example there is an OTHER_NUMBER:
⒙ Number Eighteen Full Stop https://www.compart.com/en/unicode/U+2499
So I do not classify it as a digit in my system, and the following does not work (the error message is German for "Error: not a number"):
?- number_codes(X, "₂₁").
Fehler: Keine Nummer.
user auf 1
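The numeric values mentioned above can be inspected with a small Python sketch. `numeric_kind` is a hypothetical helper; it only reports what the Unicode database says and does not reflect how any Prolog system parses numbers.

```python
import unicodedata

def numeric_kind(ch):
    """Classify a character by the numeric properties Unicode assigns to it."""
    value = unicodedata.numeric(ch, None)
    if value is None:
        return ("none", None)
    # Decimal digits are the ones usable in positional notation (0-9 and equivalents).
    if unicodedata.decimal(ch, None) is not None:
        return ("decimal", value)
    # "Digit" covers single-digit forms such as superscripts and subscripts.
    if unicodedata.digit(ch, None) is not None:
        return ("digit", value)
    # Everything else with a value: fractions, multi-digit forms, and so on.
    return ("numeric", value)

for ch in "7₂½⒙":
    print(ch, numeric_kind(ch))
```

The subscript ₂ still has a single-digit value, but ½ and ⒙ carry the values 0.5 and 18.0, so treating every OTHER_NUMBER member as a digit would not be sound.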
But I allow it in identifiers. The code from Novacore that does that is here:
sys_type_class(11, is_ident).
A drawback of allowing all OTHER_NUMBER characters in identifiers: we can now fake a period. This query works in my system:
?- X = ⒙(Y).
X = ⒙(Y).
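Why ⒙ amounts to a faked period shows up in its compatibility decomposition. The following Python sketch applies NFKC normalization; `compat_form` is a helper name chosen here, and no Prolog system in this thread is claimed to normalize its input this way.

```python
import unicodedata

def compat_form(text):
    """Return the NFKC (compatibility-composed) form of a string."""
    return unicodedata.normalize("NFKC", text)

print(compat_form("⒙"))   # the single code point expands to three characters
print(compat_form("π₂"))  # the subscript folds to an ordinary digit
```

`compat_form("⒙")` yields the three characters `18.`, including a real full stop, whereas `compat_form("π₂")` yields `π2`.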
I don't think superscripts/subscripts should be part of identifiers. If you want to use them and attach them to identifiers, make them postfix ops.
I think most of the Unicode classification efforts are rooted in the fact that different languages around the globe use different scripts and different writing directions (left to right, right to left).
Certain scripts also have rules where the writing direction changes; for example, a number 123 is not displayed as 321, but still as 123 for some reason.
This would explain why even exotic Unicode points have certain attributes stored in the Unicode database. I am currently trying to find out what algorithm Scryer Prolog is using.
For example it doesn't recognize this beast:
?- X = ⒙(Y).
X = ⒙(Y).
I have the feeling it has a criterion for what is digit-like, but unfortunately it also has no code_type/2 predicate. So I really have no clue what's going on.
But I will probably do a revision of my algorithm, to get closer to Scryer Prolog, which seems to make quite some sense.
Just toying around:
What would be an argument against including OTHER_NUMBER in Prolog identifiers?