Open eugenesvk opened 1 year ago
I've enabled CI for pull requests. I think you need to rebase before it'll work.
For what I would describe as very little value
Yeah, that was always going to be the blocker, can't explain the value of using symbols for easy differentiation if it's not apparent
there's already been a fairly hefty maintenance burden with this: finding 2 separate bugs with more likely hiding underneath.
sure, and if it's a bug like some Braille char, that could just be left in without burdening you
As I expected this is not a simple change
Interesting, I've found the opposite, didn't think I could just copy&paste a few symbols and a couple of functions and make it work, already switched a syntax to using symbols, and it's been great
At this point the only way forward for getting more unicode characters would involve a small change that's trivially correct
That would be a huge waste of time. It's much easier to exclude a char here and there that causes bugs in practice than to review every single char set against an unclear set of rules
Like that Braille empty char, the Unicode standard specifically states it's not a space
• while this character is imaged as a fixed-width blank in many fonts, it does not act as a space
(and it doesn't separate words in Sublime). Now, I don't know the difference and I'm surely not eager to find out for every char; there is simply no risk in leaving those in that would justify that kind of thorough review
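For reference, the point about the Braille blank can be checked against the Unicode character database. This is a minimal sketch in Python, not anything from SBNF itself; the helper name `describe` is made up for illustration, and only the standard `unicodedata` module is used:

```python
import unicodedata

def describe(ch):
    """Return (Unicode name, general category, str.isspace()) for one character."""
    return unicodedata.name(ch, "<unnamed>"), unicodedata.category(ch), ch.isspace()

braille_blank = "\u2800"  # BRAILLE PATTERN BLANK
ascii_space = " "

# The Braille blank is a symbol (category So), not a space separator (Zs).
print(describe(braille_blank))
print(describe(ascii_space))
```

A check like this only shows how the character is classified; whether a given parser or editor treats it as a word separator is a separate question.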
already switched a syntax to using symbols, and it's been great
At this point, I'm just wondering what such a syntax would look like tbh. Mind sharing it?
It's not ready yet (and with a few bugs I've encountered may not get ready...), but here are a couple of examples:
and the follow-up rule with just an extra scope (part of string-c⁄-), which is immediately apparent as c⁄- (within a rule or within a scope) is the only thing that changes (though this is hampered a bit by the requirement to add those ugly #[] escapes)
There is no way the measly ASCII can approach anything close to that expressive power
Looking at these examples, I feel compelled to point out that special symbols buy you little brevity (at non-zero costs). one¬two is not that much shorter than one-neg-two, but the latter is much easier for anyone to understand and type. Most identifiers in the provided examples wouldn't even double in length if converted to ASCII, using shortened words. So while it's true that special symbols can "compress" your text because there are more special symbols than letters, the degree of compression, while varying, would often be not worth it.
Alphabets have pretty good expressive power. As a strawman example, with 26 letters, just 3 letters give you 26 × 26 × 26 = 17576 possible combinations. But typing a few letters is much faster than copy-pasting a special symbol or picking from auto-complete if lucky enough to have that.
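The arithmetic here is easy to reproduce. A small Python sketch (the variable names are illustrative only) that computes the number of 3-letter names and compares the lengths of the two spellings mentioned above:

```python
import string

# Number of distinct 3-letter names over the 26 lowercase ASCII letters.
alphabet_size = len(string.ascii_lowercase)
three_letter_names = alphabet_size ** 3
print(three_letter_names)  # 17576

# Length in code points of the two spellings compared above.
symbolic, spelled_out = "one¬two", "one-neg-two"
print(len(symbolic), len(spelled_out))  # 7 11
```

This counts characters only; it says nothing about how quickly either form can be typed or read.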
This depends on one's keyboard layout, typing habits, and so on. For different people, different approaches are more efficient or convenient. I've heard of developers who find it difficult to use a keyboard, prefer to mouse-click around, and dream of a world where some "AI" would auto-generate all the code for them. In such cases, special symbols aren't significantly different from ASCII. However, a significant number of developers (hopefully a majority?) are used to blind-typing, and when working on a syntax, would prefer to type (and read) something like -neg- over ¬.
In programming languages, many features exist to satisfy only a subset of the users. It's reasonable to say that if a significant portion of the users just really want to use special symbols, privately, then great, let them! However, in addition to internal complexity and maintenance cost, this creates the possibility that some will use this non-privately, in a collaborative environment, forcing others to deal with special symbols. And I'd rather not.
Looking at these examples, I feel compelled to point out that special symbols buy you little brevity
That's mostly because you ignore the examples that buy a lot of brevity, e.g. the one I gave you in the other comment
(at non-zero costs).
with non-zero benefit
one¬two is not that much shorter than one-neg-two
These things compound. And don't cherry-pick short examples to address the brevity claim.
, but the latter is much easier for anyone to understand and type
except that the relevant group is not "anyone" but someone more specific, who could both understand and type the symbol more easily
Most identifiers in the provided examples wouldn't even double in length if converted to ASCII, using shortened words
Then there are those that would triple in length. And all of them would lose in legibility. Color your ASCII red with 🛑
degree of compression, while varying, would often be not worth it.
And then it would often be worth it
As a strawman example, with 26 letters, just 3 letters give you 26 × 26 × 26 = 17576 possible combinations.
Good that you understand it's an irrelevant strawman (why are you using longer names if 3 is enough?)
But typing a few letters is much faster than
you forgot the most obvious other option - it's much slower than typing a symbol if you've set your keyboard apps right. For example it's much faster for me to type ≈ than approximately-equal
copy-pasting a special symbol
yes, that is slow, luckily that's not the only option
or picking from auto-complete if lucky enough to have that.
You're lucky enough to have Sublime in the context of developing a Sublime syntax
This depends on one's keyboard layout, typing habits, and so on. For different people, different approaches are more efficient or convenient.
Yet you keep arguing against the approach that is more efficient
However, a significant number of developers (hopefully a majority?) are used to blind-typing, and when working on a syntax, would prefer to type
and they can continue to do so just like before
However, in addition to internal complexity
what's the complexity of an extra match table?
this creates the possibility that some will use this non-privately, in a collaborative environment,
That's simply false, you ignore the simple fact I already mentioned to you that this is already possible. What would your majority of developers used to blind-typing ASCII say to this valid SBNF syntax:
main : ( ~( oh-my-i-m-forcing-someone-to-use-chinese) )* ;
世界你好='hello world'
oh-my-i-m-forcing-someone-to-use-chinese : 'foo'{世界你好} ;
So adding more symbols conceptually changes nothing
forcing others to deal with special symbols. And I'd rather not.
You'd rather yes - you want to force people to collaborate using only the approach you admit yourself is less efficient (for some). That's not a tenable general approach and it's also not the one that's currently implemented
The SBNF syntax doesn't correctly recognize unicode characters:
äclause : a ;
This is also missing U+2028, U+2029 and U+061C, which are control characters.
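For anyone wanting to double-check these three code points, here is a short Python sketch using the standard `unicodedata` module (the `categories` dict is just an illustrative name). Unicode files them under the line separator, paragraph separator and format categories rather than Cc, though for the purpose of excluding invisible characters they behave much like controls:

```python
import unicodedata

code_points = (0x2028, 0x2029, 0x061C)

# Map each code point to its Unicode general category.
categories = {cp: unicodedata.category(chr(cp)) for cp in code_points}

for cp, cat in categories.items():
    print(f"U+{cp:04X}", unicodedata.name(chr(cp)), cat)
```

The same loop can be pointed at any other candidate character before it is added to or excluded from a set.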
Allows using visually descriptive symbols to convey meaning that's very obvious like
and then have shorter, yet more readable rules
Or you could use side-aware quotes to signal start/end (visibility of ↓ might depend on your font)
Readme is updated, includes your recommendation against using it
SBNF syntax is also updated
Closes: https://github.com/BenjaminSchaaf/sbnf/issues/33