Open Danvil opened 9 months ago
See https://github.com/rust-lang/rfcs/issues/3402 and https://www.unicode.org/reports/tr31/proposed.html#Mathematical_Compatibility_Notation_Profile
@rustbot label T-lang -C-bug C-enhancement
This would need an RFC to extend the current identifier profile (the default one from UAX 31) to use the mathematical notation profile.
This would add these characters to the identifier profile (with the superscripts and subscripts not being allowed at the beginning of an identifier)
All of these would get linted on by the uncommon_codepoints lint, since they have Identifier_Type=Not_NFKC.
(A change I want to make is for uncommon_codepoints to have slightly different lint text based on the category that is triggered: https://github.com/rust-lang/rust/issues/120228)
Hey, if we're getting mathematical characters, can I just say I would really love it if we had
≥
≤
∈
∉
∣
∤
∋
∌
∅
∪
∖
∆
∩
¬
∧
∨
⊻
⊼
⊽
Is there a Unicode profile for these?
Also what about the emoji profile 😀
There isn't one for the math operators, because those are considered operator-like.
As for emoji it's unlikely. Rust would have to put together its own set.
Emoji identifiers are a complicated can of worms.
Personally I'd be interested in math operators at least tokenizing, so they could reach macros as Punct or such, but I figure that's somewhat unlikely to actually happen.
I tried this code:
I expected the code to compile, but instead I get the compiler error message "unknown start of token \u{2207}".
This is surprising as variable names starting with Greek letters are fine:
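The snippet that originally followed is not preserved here. A minimal sketch of the kind of code meant, assuming hypothetical names (`scaled_step`, `λ`, `δx`, `φ`) that are not from the original report:

```rust
// Hypothetical example: the names and the computation are illustrative only.
fn scaled_step(λ: f64, δx: f64) -> f64 {
    // Greek letters have the XID_Start property, so they may begin an identifier.
    let φ = λ * δx;
    φ
}

fn main() {
    // By contrast, a line such as `let ∇f = 1.0;` is rejected with
    // "unknown start of token \u{2207}", as described in this report.
    println!("{}", scaled_step(2.0, 0.5));
}
```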
I believe the cause is that Rust identifiers need to start with an XID_Start Unicode character; however, the "Nabla" ∇ (0x2207) does not seem to be on that list. It would be great to have the "Nabla" operator as a valid start token for identifiers, as it is very commonly used in physics and mathematics to denote the derivative of a multi-variable function.
A possible workaround is to use the "Canadian syllabics e" ᐁ (0x1401).
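A rough sketch of that workaround, assuming a hypothetical function `ᐁf` that returns the gradient of f(x, y) = x² + 3y. The function and its body are made up for illustration. Depending on the toolchain, the compiler may still emit lint warnings for this identifier, since ᐁ is easily confused with ∇:

```rust
// Hypothetical example: `ᐁf` starts with U+1401 (Canadian syllabics e),
// which is a letter and therefore allowed at the start of an identifier.
// Gradient of the illustrative function f(x, y) = x^2 + 3y.
fn ᐁf(x: f64, _y: f64) -> (f64, f64) {
    (2.0 * x, 3.0)
}

fn main() {
    let (dx, dy) = ᐁf(1.5, 0.0);
    println!("gradient = ({}, {})", dx, dy);
}
```

The look-alike character keeps the mathematical reading of the name, at the cost of being visually indistinguishable from the real Nabla in most fonts.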