asoffer / Icarus

An experimental general-purpose programming language
Apache License 2.0

Identifiers must have an alpha-character in them. #44

Closed: asoffer closed this issue 2 years ago

asoffer commented 3 years ago

We use _ as a digit separator in numbers, and allow them to be placed anywhere, so 0_ would be considered a valid number, but _0 would be an identifier. This seems to be needlessly subtle.
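To make the subtlety concrete, here is a minimal Python sketch of the classification described above. It is not the Icarus lexer: the regular expressions and the `classify` helper are hypothetical, and they assume that a number is a digit followed by any mix of digits and underscores, while an identifier starts with a letter or underscore.

```python
import re

# Assumed reading of the current rules (not taken from the Icarus source):
# a number starts with a digit and may contain underscores anywhere after it.
NUMBER = re.compile(r"[0-9][0-9_]*")
# An identifier starts with a letter or underscore.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def classify(token: str) -> str:
    """Classify a token as 'number', 'identifier', or 'invalid'."""
    if NUMBER.fullmatch(token):
        return "number"
    if IDENTIFIER.fullmatch(token):
        return "identifier"
    return "invalid"


for token in ["0_", "_0", "_"]:
    print(token, "->", classify(token))
```

Under these assumptions `0_` is a number, while `_0` and a bare `_` are both identifiers.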

We should change the rules so that an identifier must contain at least one alphabetic character.

These rules seem relatively straightforward and have the added benefit of making _ not a valid identifier (reserving it for future use like indicating ignored values or pattern matching).

perimosocordiae commented 3 years ago

Why not ban trailing underscores in numeric literals? Effectively this means underscores may only appear internally in numbers, as implied by the name "separator".

Then state: numbers must start with a digit, and identifiers must not start with a digit. Seems pretty clear to me.
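A rough Python sketch of this alternative, for comparison. The names `NUMBER_STRICT`, `IDENTIFIER`, and `classify_strict` are hypothetical, and the sketch assumes "internally" means only that a number must begin and end with a digit (consecutive underscores in the middle are still accepted).

```python
import re

# Assumption: underscores are allowed only between the first and last digit.
NUMBER_STRICT = re.compile(r"[0-9](?:[0-9_]*[0-9])?")
# Identifiers must not start with a digit.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def classify_strict(token: str) -> str:
    """Classify a token under the 'no trailing underscore' proposal."""
    if NUMBER_STRICT.fullmatch(token):
        return "number"
    if IDENTIFIER.fullmatch(token):
        return "identifier"
    return "invalid"


for token in ["1_23", "0_", "_0"]:
    print(token, "->", classify_strict(token))
```

With this sketch `0_` is rejected outright, but `_0` is still classified as an identifier.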

asoffer commented 2 years ago

This still means that 12_3 and 1_23 are numbers but _123 is an identifier. I might be okay banning this, but I think it's best to demand that a sequence of characters containing only digits and underscores is not an identifier. It might be a number, and I'm okay banning some outright, but it shouldn't be an identifier.
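One way to express that constraint, as a hedged Python sketch: the `is_identifier` helper is hypothetical, and it additionally assumes ASCII letters only and that identifiers may not start with a digit.

```python
import re

# Assumed token shape: letters, digits, and underscores, not starting with a digit.
IDENTIFIER_SHAPE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(token: str) -> bool:
    """Return True only if the token has identifier shape AND a letter.

    A token made up solely of digits and underscores is never an identifier.
    """
    if not IDENTIFIER_SHAPE.fullmatch(token):
        return False
    return re.search(r"[A-Za-z]", token) is not None


for token in ["_123", "_", "12_3", "_a1", "x"]:
    print(token, "->", is_identifier(token))
```

Here `_123`, `_`, and `12_3` are all rejected as identifiers, while `_a1` and `x` are accepted. Whether the rejected tokens are numbers or errors is a separate decision.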

perimosocordiae commented 2 years ago

Python supports _1 as an identifier, but I'm fine banning it.

asoffer commented 2 years ago

Interesting. That's pretty compelling. I didn't realize Python allowed underscore as a digit separator. With that in mind, I'm just going to close this and stick with what we have. The fact that Python represents prior art makes me think this is probably fine. The only differences I can discern are that Python doesn't allow trailing underscores or multiple consecutive underscores. But neither of those feels like the concerning part here.
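The Python behavior mentioned here can be checked directly. This is a small sketch for Python 3.6 or later; `parses_as_int` is a hypothetical helper built on the standard `ast.literal_eval` and `str.isidentifier`.

```python
import ast


def parses_as_int(text: str) -> bool:
    """Return True if Python accepts `text` as an integer literal."""
    try:
        return isinstance(ast.literal_eval(text), int)
    except (SyntaxError, ValueError):
        # SyntaxError: malformed literal; ValueError: not a literal (e.g. a name).
        return False


print("_1".isidentifier())      # leading underscore: an identifier
print(parses_as_int("1_000"))   # internal separator
print(parses_as_int("1_"))      # trailing underscore
print(parses_as_int("1__0"))    # consecutive underscores
print(parses_as_int("_1"))      # a name, not a number
```

Python treats `_1` as an identifier and `1_000` as the integer 1000, and rejects both `1_` and `1__0` as literals.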