rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org

Fix: Consider U+01C3(ǃ) as a misleading punctuation, not as alphabetic #104574

Open Super-Pizza opened 1 year ago

Super-Pizza commented 1 year ago

Here's my reasoning:

1ǃ=2 // U+01C3 inserted here

produces the wrong error message:

error: invalid suffix `ǃ` for number literal
 --> src/main.rs:2:1
  |
2 | 1ǃ=2
  | ^^ invalid suffix `ǃ`
  |
  = help: the suffix must be one of the numeric types (`u32`, `isize`, `f32`, etc.)

error[E0070]: invalid left-hand side of assignment
 --> src/main.rs:2:3
  |
2 | 1ǃ=2
  | --^
  | |
  | cannot assign to this expression

For more information about this error, try `rustc --explain E0070`.

Noratrieb commented 1 year ago

Rust follows Unicode Standard Annex #31 (UAX #31) for what it considers valid identifiers, so changing how U+01C3 is classified is out of scope for Rust. A better error message for this would be welcome, though :)
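To illustrate why the lexer reads `ǃ` as part of a suffix rather than as an operator, here is a minimal sketch using only the standard library. It relies on `char::is_alphabetic` (the Unicode `Alphabetic` property) as a rough stand-in for the UAX #31 identifier properties, which `std` does not expose directly; the helper name `looks_like_letter_to_unicode` is hypothetical and is not anything from rustc.

```rust
/// Hypothetical helper: true when Unicode treats `c` as a letter-like
/// character and it is not ASCII punctuation.
/// Assumption: `Alphabetic` is only an approximation of the UAX #31
/// identifier properties that rustc actually follows.
fn looks_like_letter_to_unicode(c: char) -> bool {
    c.is_alphabetic() && !c.is_ascii_punctuation()
}

/// The two characters from the report.
const RETROFLEX_CLICK: char = '\u{01C3}'; // ǃ LATIN LETTER RETROFLEX CLICK
const EXCLAMATION_MARK: char = '!'; // U+0021 EXCLAMATION MARK
```

Calling `looks_like_letter_to_unicode(RETROFLEX_CLICK)` returns `true`, while the same call with `EXCLAMATION_MARK` returns `false`, which matches the "invalid suffix" diagnostic above: the click letter is glued onto the `1` as if it were a type suffix.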

Extra fun fact: this character is actually (ab)used in crates in the ecosystem: https://docs.rs/macro-vis/0.1.1/macro_vis/

Super-Pizza commented 1 year ago

I'm proposing these help messages:

help: Unicode character 'ǃ' (Latin letter retroflex click) looks like '!' (Exclamation mark), but it is not
help: Unicode character 'ǀ' (Latin letter dental click) looks like '|' (Vertical line), but it is not
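A rough sketch of how such a message could be produced from a small lookup table. This is only an illustration of the proposal, not rustc's actual diagnostic code; the names `CONFUSABLES` and `confusable_help` are hypothetical, and the table holds just the two characters mentioned here.

```rust
/// Hypothetical table: (confusable char, its name, ASCII lookalike, lookalike's name).
const CONFUSABLES: &[(char, &str, char, &str)] = &[
    ('\u{01C3}', "Latin letter retroflex click", '!', "Exclamation mark"),
    ('\u{01C0}', "Latin letter dental click", '|', "Vertical line"),
];

/// Returns the proposed help text for `c`, or `None` if `c` is not in the table.
fn confusable_help(c: char) -> Option<String> {
    CONFUSABLES
        .iter()
        .find(|entry| entry.0 == c)
        .map(|&(from, from_name, to, to_name)| {
            format!(
                "Unicode character '{}' ({}) looks like '{}' ({}), but it is not",
                from, from_name, to, to_name
            )
        })
}
```

With this sketch, `confusable_help('\u{01C3}')` yields the first help line above, and an ordinary `'!'` yields `None`, so the hint would only be attached when the suffix actually contains a lookalike character.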

Super-Pizza commented 1 year ago

Can someone do a PR for this?