Open dev-ardi opened 1 year ago
English itself is not consistent: https://en.wikipedia.org/wiki/Number_sign
I think Pound as the name of the token is fairly common, while "hashed" is the word used to describe something that has the pound token ("hashed raw string" sounds slightly better than "pounded raw string"). So I don't think anything needs changing here; the lingo is already in the ecosystem.
Or we rename everything to octothorp to remove all possible confusion.
Well, as a non-native English speaker, Pound referring to a symbol means £ to me!
As a counterargument, "hashed raw string" without any context looks like it's a string that has been hashed :P Ultimately I don't care; I just thought that it was interesting and that maybe someone else would want to think about it.
All the names are potentially ambiguous: "Pound" could mean £ or ℔, "hash" could mean hashing, and "number sign" could be anything (e.g. perhaps the sign of an integer).
Though "hash" as in "hashtags" is at least pretty universal at this point.
My recommended change would be to rename the variant Pound to Hash.
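A rename like this could be staged so that code still using the old name keeps compiling for a while. The sketch below is hypothetical: the `TokenKind` enum, its `Plus` variant, and the `from_char` helper are simplified stand-ins made up for illustration, not the actual rustc_lexer definitions.

```rust
// Hypothetical, simplified token type; not the real rustc_lexer enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TokenKind {
    /// The `#` token under the proposed name.
    Hash,
    /// The `+` token, included only to give the enum a second variant.
    Plus,
}

impl TokenKind {
    /// Transitional alias so that `TokenKind::Pound` still resolves,
    /// with a deprecation warning pointing at the new name.
    #[deprecated(note = "use `TokenKind::Hash` instead")]
    #[allow(non_upper_case_globals)]
    const Pound: TokenKind = TokenKind::Hash;

    /// Maps a single character to its token kind, if it has one.
    fn from_char(c: char) -> Option<TokenKind> {
        match c {
            '#' => Some(TokenKind::Hash),
            '+' => Some(TokenKind::Plus),
            _ => None,
        }
    }
}
```

With this shape, `TokenKind::Pound` and `TokenKind::Hash` compare equal, so the alias can be dropped later without a behavior change.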
Or we could call it Octothorpe for less ambiguity.
In this context (parsing) "number sign" and "hash" mean "#".
Hash is more widely used than pound: the latter is an Americanism that relates specifically to the symbol on a phone dial pad. ITU E.161 defines it as "Viewdata Square", ⌗. Visually similar but semantically different. Same with octothorpe.
Let me add my two cents to this discussion.
First of all, there is one more possible name for the symbol: sharp. As in C# (the language) or as in musical notation.
So our candidates are, more or less: pound, hash, octothorp, number sign, sharp, square.
£, but is the most popular name used in the code so far.
+ and -.
♯ (U+266F), but that is quite a rare symbol, unlike £, so the collision is not a big concern. Also, # and ♯ look similar, while £ is completely different, so I guess more people will understand what was meant by "Sharp" (chances are they already know about C#), while "Pound" might lead them in the wrong direction.
There are also regional names like "hex" in Singapore (but that collides with hexadecimal) and "lattice" in Russian (which, imo, is a perfect name), but those are probably off the table.
Now, second of all, before I rush and open a PR renaming all the code occurrences to "sharp", there is still one more question that needs to be answered: is this inconsistent (internal) naming really an issue worth addressing?
I'm not in love with sharp. It only makes sense in the C# and F# case because well, those are musical notes!
I learned that # was called Pound here first, but since then I've seen it pop up in other places (mostly American sources), so while confusing at first, it's something that you need to learn at some point.
I think inconsistent naming is definitely worth addressing. Ignoring it could result in 8-9 different terms, including the Octothorp(e) variations. That would make the code much harder to understand and maintain.
I personally like Pound and Octothorp, but I think Hash and Hex are the only really confusing ones (for me at least).
Location
compiler/rustc_lexer/src/lib.rs
Summary
The # token is referred to as Pound, but it's also sometimes referred to as hash (n_hashes).
This is a really minor problem, but I just wanted to point it out.
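To make the mismatch concrete, here is a minimal sketch of how the two names can end up side by side in one lexer. The types and functions below are simplified, hypothetical stand-ins written for illustration; they borrow the names Pound and n_hashes from the report but are not the actual rustc_lexer code.

```rust
// Hypothetical, simplified token type; not the real rustc_lexer definitions.
#[derive(Debug, PartialEq)]
enum TokenKind {
    /// The `#` token, named after the "pound" reading of the symbol.
    Pound,
    /// The start of a raw string such as `r##"..."##`; the field uses "hash".
    RawStr { n_hashes: u8 },
}

/// Counts the `#` characters between the `r` prefix and the opening quote.
/// Returns `None` if `src` does not start a raw string.
fn count_hashes(src: &str) -> Option<u8> {
    let rest = src.strip_prefix('r')?;
    // `#` is one byte in UTF-8, so the char count is also a byte offset.
    let n = rest.chars().take_while(|&c| c == '#').count();
    if !rest[n..].starts_with('"') || n > u8::MAX as usize {
        return None;
    }
    Some(n as u8)
}

/// Classifies the first token of `src`, looking only at `#` and raw strings.
fn lex_start(src: &str) -> Option<TokenKind> {
    if let Some(n_hashes) = count_hashes(src) {
        return Some(TokenKind::RawStr { n_hashes });
    }
    if src.starts_with('#') {
        return Some(TokenKind::Pound);
    }
    None
}
```

The same character is called "pound" in the variant name and "hash" in the field name and helper, which is the inconsistency described above.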