Closed sortraev closed 4 weeks ago
The purpose of the Unicode subscripts was to ensure that compiler-generated type variables would be completely distinct from programmer-written types. Programmers can write type names with underscores in them, but perhaps it's not a problem. The subscripts were hard to read with many fonts, so I've been considering getting rid of them for a while.
There are probably still places where the compiler will use Unicode in its output, and I don't consider that to be a problem. In 2024, not having Unicode in your terminal is as obsolete as not supporting lowercase characters.
Good points! I had not thought about your first point -- it makes sense to make that distinction in messages exposed to "regular" users, whereas e.g. futhark-dev
output is only seen by hackers.
I also agree with your second point, even if I personally prefer CLI programs to stick to ASCII whenever ASCII is sufficiently expressive (but this then ties into your first point, I guess, because I can't think of a sensible ASCII character which can be used in place of the underscore which would distinguish between source-level/compiler-generated names and not create other ambiguities)
Type variable names with subscripted tags are pretty, but if a user (in this case, a group of PMPH students trying to debug a bunch of size type mismatches) happens to use a unicode-incompatible terminal/editor (or is missing correct unicode fonts) then error messages can be confusing, especially if the missing unicode characters are shown as blank characters.
This small change makes type checker error messages universally printable/legible while preserving type variable names.
I do not know if there are other places/passes in the compiler which may print non-ascii characters (?).