Closed ericoporto closed 3 months ago
About IsScriptWordChar, somehow I did not realize this earlier, but this function is not correct for couple of reasons.
First, as you say, it should accept char
parameter, and second, using std::isalnum seems incorrect, because it may return positive for a number of extended characters, and also depends on current locale. But these are not suitable for a "script word".
The C# version of preprocessor looks more correct, where it does not use "is alphanumeric", but explicitly compares to the only valid range of chars that may be used in script code:
public static bool IsScriptWordChar(this Char theChar)
{
return ((theChar >= 'A') && (theChar <= 'Z')) ||
((theChar >= 'a') && (theChar <= 'z')) ||
((theChar >= '0') && (theChar <= '9')) ||
(theChar == '_');
}
I added the use of std::isalnum in 30829df017404a95bd2cdb26986ab1225ea1bee7 , because the previous in-compiler function was called is_alphanum
. Somehow it did not click in my head that it does not really mean to represent "alphanumeric" in a wide sense, but only in script code sense.
So the previous implementation was a correct one, it was just named wrong.
I adjusted, but actually the previous implementation also supported two additional characters
...
if (chrac == '\"') return 1;
if (chrac == '\'') return 1;
I think those are incorrect, so I redid it exactly like the C# one.
This is essentially https://github.com/adventuregamestudio/ags/commit/cc465235b5a20b2a5a377aa4140ee3cbbd6d66d8 (https://github.com/adventuregamestudio/ags/pull/2482) and https://github.com/adventuregamestudio/ags/commit/69704a20c7bb70e34609d6ec78d3549b2a2e8a9f (small part of https://github.com/adventuregamestudio/ags/pull/2180)
NOTE: I am modifying the IsScriptWordChar function which is used by the script compiler. It had an undefined behavior - the comment text is from cppreference website. It looks like it's still working but previous isalnum was returning true for non-latin characters, I don't think this breaks anything because in the script compiler this is used only once and we pass a
char
type variable instead of an int. I actually think this function should take achar
as parameter since both existing uses are only passing a char.