alessandropellegrini closed 5 years ago
By allowing a contextual interpretation of word numerical literals, we introduce a really counter-intuitive behaviour where `voglio antani, Necchi come se fosse mille` produces `int antani = 1000;`, but `mille a posterdati` emits `std::cout << mille << std::endl`.
Also, I find it confusing that variables like `mille` can be defined but cannot be used as the rhs of an assignment. Expressions and the statements that require an expression should be orthogonal: an expression must not depend on the specific statement it is used in.
An important pitfall we would introduce is in the context of infinite loops:
Lei ha clacsonato
stuzzica e brematura anche, se uno
This does not produce `do {} while(1);`, as one would expect. Also, the following produces a compile-time error:
Lei ha clacsonato
voglio mille, Necchi come se fosse mille
mille a posterdati
since the two instances of `mille` are grouped together and `a posterdati` is left without an expression to print. Another regression is that a space is now required after numerical operators. Spacing should not matter; otherwise, splitting an expression over multiple lines produces an error:
Lei ha clacsonato
voglio antani, Necchi come se fosse 10 più
2 meno 1
All in all, I think that the cons of this addition outweigh the pros. However, it makes sense to have string aliases for commonly used values. Some possibilities I am considering are `zero`, `uno`, `due` and `dieci`. Other initializers are quite uncommon.
Ok, I see the points you have raised. So what if all the numbers between `zero` and `diecimila` are interpreted as numbers in any context? I wouldn't like to have only some keywords recognized as numbers.
I can alter the patch to use one single lexer rule without any starting condition. That would reduce the number of available variable names, but would make the whole thing simpler and more consistent. What would you think in this case?
Closing for lack of feedback.
I think that in a supercazzola it is useful to write numbers as strings. This patch adds support for stringified numbers in both variable assignments and shifts. Numbers from `zero` to `novemilanovecentonovantanove` are recognized so far. I have tried to alter the existing lexer and parser as little as possible, so some things could be made better. For example, the syntax of numbers is not checked, so a number like `millemilleundicidieci` is mapped to 2021. This could be fixed by using more complex rules in the parser. Furthermore, the parser rule that sums up numbers uses right recursion. This consumes a bit more stack, but it changes the existing grammar as little as possible, and numbers won't be that long.
As for the lexer, to avoid reducing the space of possible names, numbers live in a separate starting condition. This means that a sentence like:
voglio mille, Necchi come se fosse mille
is correctly interpreted (the first `mille` is the variable name, the second `mille` is recognized as 1000). Note that this implies that, if one later writes `voglio antani, Necchi come se fosse mille`, `mille` is not mapped to the previously declared variable, but this is a design choice.
To avoid clashes with future tokens, whenever a character that is not recognized as part of a string number is found, the lexer falls back to the INITIAL starting condition (the shift starting condition should be perfectly compatible with this behaviour).
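The effect of keeping numbers in their own starting condition can be sketched outside the lexer generator. The following C++ fragment is only a minimal illustration, not the patch's lexer: `tokenize`, `Kind`, `Token` and the small word list are hypothetical names, and it assumes that the number condition is entered right after the words `come se fosse`.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical token kinds; the real token names are not shown in this thread.
enum class Kind { Word, Number };

struct Token {
    Kind kind;
    std::string text;
};

// Toy subset of the recognized number words (assumption).
static const std::set<std::string> kNumberWords = {"zero", "uno", "due", "dieci", "mille"};

// Words that, in this sketch, switch the tokenizer to the number condition.
static const char* const kTrigger[] = {"come", "se", "fosse"};

// Splits the input on whitespace and classifies each word. Number words are
// recognized only while the number condition is active; any other word makes
// the tokenizer fall back to the initial condition.
std::vector<Token> tokenize(const std::string& text) {
    std::vector<Token> out;
    std::istringstream in(text);
    std::string word;
    bool inNumber = false;    // true while the number condition is active
    std::size_t matched = 0;  // how many trigger words have been seen in a row

    while (in >> word) {
        if (inNumber && kNumberWords.count(word) > 0) {
            out.push_back({Kind::Number, word});
            continue;
        }
        inNumber = false;  // fall back to the initial condition
        out.push_back({Kind::Word, word});

        if (word == kTrigger[matched]) {
            ++matched;
        } else {
            matched = (word == kTrigger[0]) ? 1 : 0;
        }
        if (matched == 3) {
            inNumber = true;
            matched = 0;
        }
    }
    return out;
}
```

With this sketch, `tokenize("voglio mille, Necchi come se fosse mille")` classifies the first `mille,` as a plain word and the last `mille` as a number, while every word of `tokenize("mille a posterdati")` stays a plain word because the number condition is never entered.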
To allow for the recognition of string numbers, the appropriate starting condition has to be explicitly set in the lexer. This means that, e.g., in `mille a posterdati`, the string `mille` is not currently mapped to a number.
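As a side note on the numeral mapping described earlier, the unchecked, right-recursive summation can be sketched in a few lines of C++. This is an illustration under stated assumptions, not the patch's grammar: `sumWords` and `kValue` are hypothetical names, the input is assumed to be already split into number words, and each word is assumed to contribute a plain addend.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical values for a few number words (assumption: purely additive).
static const std::map<std::string, int> kValue = {
    {"uno", 1}, {"dieci", 10}, {"undici", 11}, {"mille", 1000}};

// Right-recursive sum in the spirit of "number := word number":
// the value of the first word plus the value of the rest of the sequence.
// The order and combination of words are not validated; an unknown word
// makes std::map::at throw std::out_of_range.
int sumWords(const std::vector<std::string>& words, std::size_t pos = 0) {
    if (pos == words.size()) {
        return 0;  // nothing left to add
    }
    return kValue.at(words[pos]) + sumWords(words, pos + 1);
}
```

Because nothing checks the sequence, `sumWords({"mille", "mille", "undici", "dieci"})` yields 2021 even though the words do not form a valid Italian numeral. The recursion depth equals the number of words, which is the extra stack use mentioned in the description.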