Raku / problem-solving

🦋 Problem Solving, a repo for handling problems that require review, deliberation and possibly debate
Artistic License 2.0

Extended identifiers- reserving :sym puzzles me #225

Open fecundf opened 4 years ago

fecundf commented 4 years ago

There's a note at the bottom of https://docs.raku.org/syntax/identifiers which troubles me-

Starting with Raku language version 6.d, colon pairs with sym as the key (e.g. :sym) are reserved for possible future use.

That note creates an exception that I don't want to think about, and I imagine it will be the source of a hard-to-find bug years from now.

That note implies that a list of valid extended identifiers would contain party:symbol while a list of invalid ones would contain party:sym ... I don't want to explain that to my kids.

edit- This came up when I was thinking about https://github.com/Raku/problem-solving/issues/224 and how XML::Actions uses methods in the form of namespace:tag - a tag named sym runs afoul of this possible future use.

fecundf commented 4 years ago

Raku must see a need to reserve a special extended identifier, hence the note, and it apparently isn't using it yet. Is this a problem for syntax to solve instead of adding reserved words? party:something-other-than-an-ordinary-identifier

raiph commented 4 years ago

Raku must see a need to reserve a special extended identifier, hence the note

Correct.

and it apparently isn't using it yet.

Incorrect, :sym is used very extensively in grammars.


In general, my advice is to not use extended identifiers except for uses explicitly mentioned in the doc.

Use of :sym<...> is mentioned in this section. The built-in classes that handle processing of grammars treat :sym<...> as a special case.

The :sym<> mechanism has two aspects:

So:

grammar g {
  proto rule foo {*}
  token foo:sym<bar> { <sym> }
}

say so 'bar' ~~ / <g::foo> / # True
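
A slightly fuller sketch of the same mechanism may help here. The names Greeting and GreetingActions and the symbols hello and bye are made up for illustration. Each :sym<...> candidate is reached through the proto, <sym> matches the literal text inside the angle brackets, and an actions class can use the same extended names to tell the candidates apart:

```raku
# Hypothetical grammar: two candidates for the proto token "word".
grammar Greeting {
    token TOP { <word> }
    proto token word {*}
    token word:sym<hello> { <sym> }   # <sym> matches the literal 'hello'
    token word:sym<bye>   { <sym> }   # <sym> matches the literal 'bye'
}

# Hypothetical actions class: method names mirror the token names.
class GreetingActions {
    method TOP($/)             { make $<word>.made }
    method word:sym<hello>($/) { make 'greeting' }
    method word:sym<bye>($/)   { make 'farewell' }
}

say Greeting.parse('bye', actions => GreetingActions).made;   # farewell
say ~Greeting.parse('hello')<word><sym>;                      # hello
say so Greeting.parse('nope');                                # False
```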

The built-in grammars have taken advantage of this for around a decade. Users' grammars also do so. The :sym-is-reserved ship has all but sailed.

(I say "all but" because in theory someone with enormous commitment to do the enormous amount of work needed to convince folk to deprecate :sym<...> and switch to just :<...> might be able to pull it off over a 5 year or so period.)


Aiui, someone has already been using extended identifiers for method names. The interesting question there is whether there can be a clash between their usage and :sym<...>.

In a nutshell, I see it as asking for trouble if used for rule/method names in a grammar or actions class, and probably a non-issue if used otherwise (but note well my weasel word "probably").

First, let's say you put a rule in a grammar with :sym in its name but no <...> after the :sym, and then accidentally write a <sym> in the pattern:

grammar g { proto rule foo {*}; token foo:sym<bar> { <sym> }; token foo:sym { <sym> } }

That generates a compile-time error. So at least you can't accidentally do that and not notice.

But you can write:

grammar g { proto rule foo {*}; token foo:sym<bar> { <sym> }; token foo:sym { bar } }

And that'll become one of the foo tokens considered to be candidates corresponding to the proto rule foo.
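
One way to see this for yourself is to parse starting from the proto rule. This is a sketch only: the grammar name G is made up, and the body of the bare foo:sym token is changed to baz (an assumption for illustration) so that the two candidates can be told apart. If the bare token is picked up as a candidate, as described above, both inputs should be accepted:

```raku
grammar G {
    proto rule foo {*}
    token foo:sym<bar> { <sym> }   # the usual form: matches the literal 'bar'
    token foo:sym      { baz }     # bare :sym with no <...>; body matches 'baz'
}

# Start parsing at the proto rule rather than at TOP.
say so G.parse('bar', :rule<foo>);
say so G.parse('baz', :rule<foo>);
```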

So, dodgy in grammars.

A similar issue arises in action classes.

But outside of those, I don't currently see how extended identifiers in method names can cause harm.

fecundf commented 4 years ago

Thanks for the background. The footnote reserving sym didn't mention a context, which made me think a declaration my $lexical:sym; would become invalid.

Reserving "...:sym" in the context of grammars is fine with me.

Back to the original footnote. Is this a documentation issue: should it say sym is reserved for grammars and actions (enshrining current practice) as opposed to reserving ":sym" everywhere?

raiph commented 4 years ago

@fecundf

should it say sym is reserved for grammars and actions (enshrining current practice) as opposed to reserving ":sym" everywhere?

I don't know what it should say. I'll let others decide that. If it were left to me, I think I'd leave the doc as it is. But that may just be because, with this comment, I think I've used up what energy I have to investigate it.

I do know that something like the footnote declaration (that :sym is reserved) is appropriate, because it is used in a particular way in grammars. I wasn't aware of it being explicitly reserved. And I hadn't seen the footnote notifying that this is so, until this issue you've opened brought it to my attention. But it makes sense to me.

I also know that your suggestion would be wrong, or at least incomplete. But I only found that out by digging, as follows.


All Raku projects are stored in git repos, with a centralized online copy on GitHub, and researchable using git features. So I used git's blame feature to dig out the original commit of that documentation line, and it (unsurprisingly) turned out to be by Zoffix.

The commit note provided further links, so I followed those too.


One patch showed a new error message.

I was curious and tried to invoke the error:

sub foo:sym<bar> {}

displays:

The :sym<> colonpair is reserved

Bingo. So it looks like Zoffix did land these patches.


It also looks like your suggestion ("say sym is reserved for grammars and actions") would be inadequate, because the above error is reserving them from use in sub identifiers.

Curiously I don't get the error for method or variable declarations:

my method foo:sym<bar> {}
my $foo:sym<bar> = 42;

These compile fine.

(These three -- including sub -- were just the first three identifier declaration things I tried. If you or anyone else really wishes to pursue this further, then a wider check would presumably be appropriate.)
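
A wider check could be scripted by feeding each declaration to EVAL and reporting whether it compiles. This is only a sketch: the list of declarations is illustrative (the class entry is an extra guess at something worth probing), and the results will depend on the compiler and language version in use.

```raku
use MONKEY-SEE-NO-EVAL;   # required because the code strings are not literals

# Declarations to probe; extend this list with other declarator kinds.
my @declarations =
    'sub foo:sym<bar> {}',
    'my method foo:sym<bar> {}',
    'my $foo:sym<bar> = 42',
    'my class Foo:sym<bar> {}';

for @declarations -> $code {
    # try swallows the exception and leaves it in $! (Nil on success).
    try EVAL $code;
    say $! ?? "rejected: $code ({$!.message})" !! "accepted: $code";
}
```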

This odd state of affairs suggests to me that Zoffix deliberately decided to disallow use of :sym in subs but deliberately allow it to remain in methods and variables (and perhaps other identifiers, such as for classes etc).

I do not understand the logic of this state of affairs.


I also note that :<> is reserved ("Empty key was already reserved for custom ops categories only"). This seems to affirm my view that we could theoretically transition to that as the only reserved form. (Though, to reiterate, it feels like the effort would be out of all proportion to the gain.)


I wanted to try to give you more useful info, and to investigate far enough to help anyone out who wishes to resolve this issue. I suspect my steam engine has now run out of road, or at least steam.

In closing, I see three options:

Good luck to whoever reads this whichever way things go. :)

fecundf commented 4 years ago

That's a great explanation, thanks for following up, examining the repo, and explaining!

As you noted, the extended syntax is already used in the wild, and is unlikely to clash with well-established grammar features.

I'm curious if Zoffix can elaborate on the intent as to exactly where ident:sym<...> is reserved? And the long-term plans for ident:<...> empty-string variations? Not sure how to tag in here, will send a twitter DM. In the meantime I have a guess-

  1. Cannot disallow methods in the form of ident:sym<...> because that would break grammar rules = methods
  2. Can disallow sub ident:sym<...>(...){...} to reserve sym<...> for future use, but is that needed?
  3. Variables with extended syntax have no meaning in grammars, so no need to reserve :sym<...> in there.
  4. Bare :sym does not conflict, so this works (and no conflict with that XML library) sub foo:sym ($thing) { say $thing } ; foo:sym "I work"
  5. Reserved nature only implemented at 1st level, this works sub deep:level:sym<ok>($a) { say $a } ; deep:level:sym<ok> "all good"

raiph commented 4 years ago

Zoffix, asked for info on Twitter, said he was just going by what TimToady said, which he thinks ("Not 100% sure") was this exchange in 2017 on the #perl6 IRC channel.

fecundf commented 4 years ago

The exchange in 2017 on the #perl6 IRC channel is fascinating reading, though I am not able to spend enough time on it now to grok it! I get the sense that Raku's grammar implementation, as mentioned in the chat log, precludes sub keyword:sym<foo>, and when this caused errors, it was easiest to say "don't do that!" by reserving sub foo:sym<...>.

I'd like to leave this open until I can read this more and understand it, or for someone else with interest in it to dive in. My sense is still that this could be explained better, or optimized. Or at least it will give me an excuse to read Raku grammar source.