rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org

Inconsistent naming for '#' #113159

Open dev-ardi opened 1 year ago

dev-ardi commented 1 year ago

Location

compiler/rustc_lexer/src/lib.rs

Summary

The # token is referred to as Pound, but it is sometimes also referred to as hash, for example in n_hashes.

This is a really minor problem, but I just wanted to point it out.
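To illustrate the inconsistency described above, here is a minimal sketch of a lexer that uses both names for the same character. It is a simplified, hypothetical stand-in, not the actual code in compiler/rustc_lexer/src/lib.rs: only the names Pound and n_hashes come from the issue, while the enum layout, the Other variant, and the first_token helper are assumptions made for illustration.

```rust
/// Simplified, hypothetical token kinds (not the real rustc_lexer types).
#[derive(Debug, PartialEq, Eq)]
pub enum TokenKind {
    /// A lone `#`, which this naming scheme calls "Pound".
    Pound,
    /// A raw string such as `r#"..."#`; here the same character is counted as "hashes".
    RawStr { n_hashes: u8 },
    /// Anything this sketch does not model.
    Other,
}

/// Classify the token at the start of `input` (hypothetical helper).
pub fn first_token(input: &str) -> TokenKind {
    if input.starts_with('#') {
        return TokenKind::Pound;
    }
    if let Some(rest) = input.strip_prefix('r') {
        // Count the leading `#` characters after the `r` prefix.
        let n = rest.chars().take_while(|&c| c == '#').count();
        // `#` is one byte in UTF-8, so `n` is also a valid byte offset.
        if rest[n..].starts_with('"') {
            // The cast truncates above 255; acceptable for a sketch.
            return TokenKind::RawStr { n_hashes: n as u8 };
        }
    }
    TokenKind::Other
}
```

The same character is a Pound when it stands alone and a hash when it delimits a raw string, which is the mismatch this issue points out.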

the8472 commented 1 year ago

English itself is not consistent: https://en.wikipedia.org/wiki/Number_sign

tgross35 commented 1 year ago

I think Pound as the name of the token is fairly common, and "hashed" is the word used to describe something that has the pound token ("hashed raw string" sounds slightly better than "pounded raw string"). So I don't think anything needs changing here; the lingo is already in the ecosystem.

Or we rename everything to octothorp to remove all possible confusion

dev-ardi commented 1 year ago

Well, as a non-native English speaker, Pound referring to a symbol means £ to me!

As a counterargument, "hashed raw string" without any context looks like it's a string that has been hashed :P Ultimately I don't care; I just thought that it was interesting and that maybe someone else would want to think about it.
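The two readings of "hashed raw string" can be put side by side in a short sketch. The function names raw_with_hashes and hash_of are hypothetical and chosen only for this example; the standard library's DefaultHasher stands in for "a hash function" in general.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Reading 1: a raw string literal delimited with `#`, so that it can
/// contain double quotes without escaping them.
pub fn raw_with_hashes() -> &'static str {
    r#"say "hi""#
}

/// Reading 2: a string that has been run through a hash function.
pub fn hash_of(s: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    s.hash(&mut hasher);
    hasher.finish()
}
```

Calling raw_with_hashes() gives back the text say "hi" with the quotes intact, while hash_of returns a 64-bit number whose exact value is not specified by the standard library.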

ChrisDenton commented 1 year ago

All the names are potentially ambiguous. "Pound" meaning £ or , "hash" meaning hashing and "number sign" could be anything (e.g. perhaps the sign of an integer).

Though "hash" as in "hashtags" is at least pretty universal at this point.

dev-ardi commented 1 year ago

My recommended change would be to rename the variant Pound to Hash.

4gboframram commented 1 year ago

Or we could call it Octothorpe for less ambiguity

ds84182 commented 1 year ago

In this context (parsing) "number sign" and "hash" mean "#".

Hash is more widely used than pound: the latter is an Americanism that relates specifically to the symbol on a phone dial pad. ITU E.161 defines it as "Viewdata Square", ⌗. That is visually similar but semantically different. The same goes for octothorpe.

GrigorenkoPV commented 6 months ago

Let me add my two cents to this discussion.

First of all, there is one more possible name for the symbol: sharp. As in C# (the language) or as in musical notation.

So our candidates are, more or less: pound, hash, octothorp, number sign, sharp, square.

There are also regional names like "hex" in Singapore (but that collides with hexadecimal) and "lattice" in Russian (which, imo, is a perfect name), but those are probably off the table.

Now, second of all, before I rush and open a PR renaming all the code occurrences to "sharp", there is still one more question that needs to be answered: is this inconsistent (internal) naming really an issue worth addressing?

dev-ardi commented 6 months ago

I'm not in love with sharp. It only makes sense in the C# and F# case because, well, those are musical notes!

I first learned here that # was called Pound, but since then I've seen it pop up in other places (mostly American sources), so while it's confusing at first, it's something that you need to learn at some point.

5-pebbles commented 1 month ago

I think inconsistent naming is definitely worth addressing. Ignoring it could result in 8-9 different terms, including the Octothorp(e) variations. That would make the code much harder to understand and maintain.

I personally like Pound and Octothorp, but I think Hash and Hex are the only really confusing ones (for me at least).