Closed rfindler closed 5 months ago
It looks like this is because of the documentation link that `block` gets. If that's the case, then maybe `block` should have a syntax location that's just the colon. That'd match what shows up in the docs, as following the link to the docs ends up here.
Yes, the `block` identifier in the S-expression encoding of Shrubbery does get a source location that spans the block. I experimented some time back with making the source location just the `:`, but that didn't work out well for other reasons, like error reporting and `@rhombus` typesetting, if I remember correctly.
The underlying difference between source locations in plain S-expressions (i.e., syntax objects that start out as S-expressions) versus the S-expression encoding of shrubbery is that the shrubbery encoding avoids attaching information to pairs. Attaching information to pairs is always a little fragile, and more so with the shrubbery encoding because pairs don't correspond to parentheses in the source.
I'm not sure of the right choices here. It may be that `block` needs to have different source locations associated with it for different purposes, but it would be nice to avoid that kind of complexity.
It does seem like a few things are true:
What other degrees of freedom does that leave?
Maybe there is a distinction that we should be drawing between identifiers that the programmer wrote and implicit ones (which would include this and also `#%app` on the Racket side). If Check Syntax knew that it was looking at an identifier that has some different read-level manifestation, then it would know to treat it differently. Does that seem like a plausible direction to look for a solution?
> Maybe there is a distinction that we should be drawing between identifiers that the programmer wrote and implicit ones
Ah, I think this is on the right track. It's not about "implicit" versus "explicit" though, but about structural identifiers in the S-expression representation of shrubbery versus shrubbery identifiers.
Now that I pay more attention to the example, I'd say the issue is that the literal `block` that is intended for shrubbery representation is being treated as a `block` identifier that might be bound. It just happens that `block` used as part of the shrubbery representation matches the name of a binding in Rhombus. Normally, the `block`s that start out in a shrubbery representation of a Rhombus program don't persist in an expanded program, but a quoted `block` does.
Here's another example along those lines:
def op = 5
'+'
An arrow is drawn from `+` to `op` because the S-expression representation is `(op +)`.
So maybe the shrubbery reader should include some property on identifiers like `block`, `op`, and `group` to say that they're structural and not something that can be bound?
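As a rough illustration of the property idea, here is a minimal Racket sketch. The property key `'shrubbery-structural` and both helper names are hypothetical and chosen only for this example; nothing here reflects what the shrubbery reader or Check Syntax actually does.

```racket
#lang racket/base

;; Hypothetical property key, invented for this sketch.
(define structural-key 'shrubbery-structural)

;; Mark an identifier as a structural tag of the S-expression encoding.
(define (mark-structural id)
  (syntax-property id structural-key #t))

;; The question a tool could ask before drawing a binding arrow.
(define (structural-identifier? stx)
  (and (identifier? stx)
       (syntax-property stx structural-key)
       #t))

;; The encoding of '+' from the example above: (op +).
;; Only the `op` tag is marked; `+` is what the programmer wrote.
(define encoded
  (datum->syntax #f (list (mark-structural (datum->syntax #f 'op))
                          (datum->syntax #f '+))))

(define parts (syntax->list encoded))

(structural-identifier? (car parts))  ; the `op` tag => #t
(structural-identifier? (cadr parts)) ; the `+` operator => #f
```

With something along these lines, a tool could skip structural identifiers when collecting binding arrows, without the encoding having to switch to keywords.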
Another possibility is to use keywords like `#:block` in a shrubbery representation, instead of identifiers like `block`. That would look noisy, though, and it would require a lot of changes to the Rhombus implementation.
I see what you mean about keywords vs symbols and if there were information around saying "this identifier should really be treated as a syntax object that has a keyword inside, not a symbol inside" then I think we'd be in good shape (or, of course, if there were actually keywords inside).
Continuing the example above, I see that this:
def parens = 5
'(abc,
def,
ghi,
jkl)'
gets me an arrow from the `parens` to the middle of the syntax object. Is that also a problematic arrow?
Here's a larger example that has a bunch of arrows that point into empty space. I'm including it only as it was the one I started this PR from but I think it confirms your diagnosis:
#lang rhombus/and_meta
import lib("racket/base.rkt")
import lib("racket/base.rkt"):
  meta
  as base_meta
annot.macro '-> ($d, ...) $r':
  def [x, ...] = base_meta.#{generate-temporaries}(PairList['$d', ...]).to_list()
  def len = ['$d', ...].length()
  'converting(fun (v):
                if base.#{procedure?}(v)
                | if base.#{procedure-arity-includes?}(v,$len)
                  | base.#{chaperone-procedure}(v,fun($x :: $d, ...):
                                                    values(fun
                                                           | (r :: $r): r
                                                           | (bad): error("wrong result to function "), $x, ...))
                  | error("->: expected a function of arity " +& $len +& "\n got " +& v)
                | error("->: expected a function\n got: " +& v))'
and, if you position your mouse just right, you can see these ones (the leftmost and rightmost seem especially sketchy):
Looking at the expansion, the only abnormally sized identifiers are `block` and `parens`, but `parens` always returns `#f` from `identifier-binding`, so I think these arrows are probably all from `block`.
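For reference, the `identifier-binding` check mentioned here can be sketched in plain Racket as follows. The helper name `possibly-arrow-source?` is made up for this example, and treating any non-`#f` result as "could get an arrow" is an assumption, not a description of how Check Syntax decides.

```racket
#lang racket/base

;; An identifier with no lexical context, standing in for a tag that
;; never picks up a binding.
(define unbound-tag (datum->syntax #f 'parens))

;; An identifier that refers to a binding available in this module.
(define bound-id #'car)

;; Hypothetical helper: any non-#f result from identifier-binding
;; counts as "has a binding".
(define (possibly-arrow-source? id)
  (and (identifier-binding id) #t))

(possibly-arrow-source? unbound-tag) ; => #f
(possibly-arrow-source? bound-id)    ; => #t
```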
Yes, the `parens` example makes sense as the same sort of issue, since shrubbery-level parentheses are represented with a `parens` symbol in the S-expression encoding. And I agree with your interpretation of the purple arrows to various floating points inside the giant green blob.
With this program:
it seems there are some implicitly inserted identifiers that have very large spans. I'm not sure if a change to Check Syntax is in order or to Rhombus, so I wanted to open the issue to ask what folks think. The largest one in this program is a `parens` that starts at position 29 and goes to position 77, but this one doesn't seem to trigger a big green blob. The second largest one is a `block` that goes from position 37 to position 76, which does seem to trigger a big green blob.
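One way to spot such identifiers mechanically is to compare an identifier's source span with the length of its name. The Racket sketch below does that with the positions reported above. The helpers `id-at` and `oversized-identifier?` are hypothetical, the span is taken as end minus start, and the heuristic itself is only an assumption for illustration, not something Check Syntax is known to do.

```racket
#lang racket/base

;; Build an identifier with an explicit source location.
;; The srcloc vector is (vector source line column position span).
(define (id-at sym pos span)
  (datum->syntax #f sym (vector 'example #f #f pos span)))

;; Heuristic (an assumption): an identifier whose span is longer than
;; its name was probably not written literally in the source.
(define (oversized-identifier? id)
  (define span (syntax-span id))
  (and span
       (> span (string-length (symbol->string (syntax-e id))))))

(define block-id  (id-at 'block 37 39))  ; position 37 to 76
(define parens-id (id-at 'parens 29 48)) ; position 29 to 77
(define plain-id  (id-at 'len 10 3))     ; span equals the name length

(oversized-identifier? block-id)  ; => #t
(oversized-identifier? parens-id) ; => #t
(oversized-identifier? plain-id)  ; => #f
```

This flags both `block` and `parens`; telling them apart still needs the binding check, since only `block` happens to match a bound name.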