numbas / Numbas

A completely browser-based e-assessment/e-learning system, with an emphasis on mathematics
http://www.numbas.org.uk
Apache License 2.0

Allow more letters in variable names #787

Closed christianp closed 2 years ago

christianp commented 3 years ago

This is related to #690 in that it involves dealing with unicode.

I tried to use the letter ß in a variable name, but it was rejected as invalid. We should accept more characters in variable names.

JavaScript allows a huge range of characters in identifiers. Quoting from an article by Mathias Bynens:

An identifier must start with $, _, or any character in the Unicode categories “Uppercase letter (Lu)”, “Lowercase letter (Ll)”, “Titlecase letter (Lt)”, “Modifier letter (Lm)”, “Other letter (Lo)”, or “Letter number (Nl)”. The rest of the string can contain the same characters, plus any U+200C zero width non-joiner characters, U+200D zero width joiner characters, and characters in the Unicode categories “Non-spacing mark (Mn)”, “Spacing combining mark (Mc)”, “Decimal digit number (Nd)”, or “Connector punctuation (Pc)”.
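For illustration, the quoted rule could be written as a regular expression using Unicode property escapes (available in JavaScript engines that support ES2018). This is only a sketch: `IDENTIFIER_RE` and `isValidName` are hypothetical names, not anything that exists in Numbas, and the pattern ignores reserved words.

```javascript
// Hypothetical sketch of the quoted rule; not part of the Numbas codebase.
// First character: $, _, or a character in Lu, Ll, Lt, Lm, Lo or Nl.
// Remaining characters: the same set, plus U+200C, U+200D, Mn, Mc, Nd and Pc.
const IDENTIFIER_RE = new RegExp(
  "^[$_\\p{Lu}\\p{Ll}\\p{Lt}\\p{Lm}\\p{Lo}\\p{Nl}]" +
  "[$_\\p{Lu}\\p{Ll}\\p{Lt}\\p{Lm}\\p{Lo}\\p{Nl}" +
  "\\u200C\\u200D\\p{Mn}\\p{Mc}\\p{Nd}\\p{Pc}]*$",
  "u" // the u flag is required for \p{...} property escapes
);

// Returns true if the whole string matches the rule above.
function isValidName(name) {
  return IDENTIFIER_RE.test(name);
}
```

With this pattern, `isValidName("ß")` and `isValidName("straße")` are accepted, while `isValidName("1x")` and `isValidName("a b")` are rejected.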

I wonder if we could do the same. We might get in trouble with normalising: should the various characters that look like x all be treated as equivalent to x, or not?
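To make the normalising question concrete, here is a small sketch using the standard `String.prototype.normalize` method. The helper name `normaliseName` is hypothetical, and choosing NFKC is an assumption made for illustration, not a decision about what Numbas should do.

```javascript
// Hypothetical helper: fold compatibility variants of a name into one form.
function normaliseName(name) {
  return name.normalize("NFKC");
}

const fullwidthX = "\uFF58";     // FULLWIDTH LATIN SMALL LETTER X
const mathItalicX = "\u{1D465}"; // MATHEMATICAL ITALIC SMALL X
const cyrillicKha = "\u0445";    // CYRILLIC SMALL LETTER HA, looks like x

// Compatibility variants collapse to the plain Latin x under NFKC.
console.log(normaliseName(fullwidthX) === "x");
console.log(normaliseName(mathItalicX) === "x");

// A letter from another script is left alone, even though it looks the same.
console.log(normaliseName(cyrillicKha) === "x");
```

The first two comparisons are true and the third is false, so NFKC would merge the fullwidth and mathematical variants with x but would still treat the Cyrillic letter as a different name. Handling look-alikes across scripts would need something beyond Unicode normalisation.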