racket / rhombus

Rhombus programming language

Giant green blobs in DrRacket #489

Closed rfindler closed 5 months ago

rfindler commented 6 months ago

With this program:

#lang rhombus/and_meta

'abc(fun (v):
       if 1 = 2
       | 3
       | 4)'

it seems there are some implicitly inserted identifiers that have very large spans. I'm not sure if a change to Check Syntax is in order or to Rhombus, so I wanted to open the issue to ask what folks think. The largest one in this program is a parens that starts at position 29 and goes to position 77, but this one doesn't seem to trigger a big green blob. The second largest one is a block that goes from position 37 to position 76, which does seem to trigger a big green blob.

rfindler commented 6 months ago

It looks like this is because of the documentation link that block gets. If that's the case, then maybe block should have a syntax location that's just the colon. That'd match what shows up in the docs, as following the link to the docs ends up here.

mflatt commented 6 months ago

Yes, the block identifier in the S-expression encoding of Shrubbery does get a source location that spans the block. I experimented some time back with making the source location just the :, but that didn't work out well for other reasons — like error reporting and @rhombus typesetting, if I remember correctly.

The underlying difference between source locations in plain S-expressions (i.e., syntax objects that start out as S-expressions) versus the S-expression encoding of shrubbery is that the shrubbery encoding avoids attaching information to pairs. Attaching information to pairs is always a little fragile, and more so with the shrubbery encoding because pairs don't correspond to parentheses in the source.
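To make the encoding concrete, here is a rough sketch in plain Racket of how the program from the first comment might look as an S-expression. The datum is written by hand as an approximation of the shrubbery encoding (the exact shape, including the top-level multi tag, is an assumption), and structural-tags and head-tags are hypothetical helpers, not part of any library. It only illustrates that tags such as block and parens sit at the head of a list as ordinary symbols, which is where a source location can be attached instead of on the pair.

```racket
#lang racket/base
(require racket/list)

;; Hand-written approximation (assumption) of the encoding of
;;   'abc(fun (v): if 1 = 2 | 3 | 4)'
(define encoded
  '(multi
    (group
     (quotes
      (group abc
             (parens
              (group fun
                     (parens (group v))
                     (block
                      (group if 1 (op =) 2
                             (alts (block (group 3))
                                   (block (group 4))))))))))))

;; Subset of the tags the encoding uses for structure (assumption).
(define structural-tags '(multi group quotes parens block alts op))

;; Hypothetical helper: collect the head tag of every structural
;; list form, in source order.
(define (head-tags d)
  (cond
    [(and (pair? d) (memq (car d) structural-tags))
     (cons (car d) (append-map head-tags (cdr d)))]
    [else '()]))

(head-tags encoded)
```

Each tag in the result corresponds to one identifier in the syntax object, and each of those identifiers carries a source span covering the whole form it tags.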

I'm not sure of the right choices here. It may be that block needs to have different source locations associated with it for different purposes, but it would be nice to avoid that kind of complexity.

rfindler commented 6 months ago

It does seem like a few things are true:

What other degrees of freedom does that leave?

Maybe there is a distinction that we should be drawing between identifiers that the programmer wrote and implicit ones (which would include this and also #%app on the racket side). If Check Syntax knew that it was looking at an identifier that has some different read-level manifestation, then it would know to treat it differently. Does that seem like a plausible direction to look for a solution?

mflatt commented 6 months ago

Maybe there is a distinction that we should be drawing between identifiers that the programmer wrote and implicit ones

Ah, I think this is on the right track. It's not about "implicit" versus "explicit" though, but about structural identifiers in the S-expression representation of shrubbery versus shrubbery identifiers.

Now that I pay more attention to the example, I'd say the issue is that the literal block that is intended for the shrubbery representation is being treated as a block identifier that might be bound. It just happens that block used as part of the shrubbery representation matches the name of a binding in Rhombus. Normally, the blocks that start out in a shrubbery representation of a Rhombus program don't persist in an expanded program, but a quoted block does.

Here's another example along those lines:

def op = 5
'+'

An arrow is drawn from + to op because the S-expression representation is (op +).

So maybe the shrubbery reader should include some property on identifiers like block, op, and group to say that they're structural and not something that can be bound?
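A minimal sketch of what such a property could look like, using Racket's syntax-property. The key name shrubbery-structural and both helper functions are hypothetical; nothing here is an existing convention in the shrubbery reader or in Check Syntax.

```racket
#lang racket/base

;; Hypothetical property key; an assumption for illustration only.
(define structural-key 'shrubbery-structural)

;; What a reader could do when it creates a tag such as block, op, or group.
(define (mark-structural id)
  (syntax-property id structural-key #t))

;; What a tool could check before treating an identifier as
;; a possible reference to a binding.
(define (structural-identifier? stx)
  (and (identifier? stx)
       (syntax-property stx structural-key)
       #t))

(define block-tag (mark-structural (datum->syntax #f 'block)))
(define user-id (datum->syntax #f 'block))

(structural-identifier? block-tag) ; #t
(structural-identifier? user-id)   ; #f
```

Both identifiers have the same symbol, so only the property distinguishes the structural tag from an identifier that the programmer wrote.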

Another possibility is to use keywords like #:block in a shrubbery representation, instead of identifiers like block. That would look noisy, though, and it would require a lot of changes to the Rhombus implementation.

rfindler commented 6 months ago

I see what you mean about keywords vs symbols and if there were information around saying "this identifier should really be treated as a syntax object that has a keyword inside, not a symbol inside" then I think we'd be in good shape (or, of course, if there were actually keywords inside).

Continuing the example above, I see that this:

def parens = 5
'(abc,

  def,

  ghi,

  jkl)'

gets me an arrow from the parens to the middle of the syntax object. Is that also a problematic arrow?

Here's a larger example that has a bunch of arrows that point into empty space. I'm including it only because it was the one I started this issue from, but I think it confirms your diagnosis:

#lang rhombus/and_meta
import lib("racket/base.rkt")
import lib("racket/base.rkt"):
  meta
  as base_meta

annot.macro '-> ($d, ...) $r':
  def [x, ...] = base_meta.#{generate-temporaries}(PairList['$d', ...]).to_list()
  def len = ['$d', ...].length()
  'converting(fun (v):
                if base.#{procedure?}(v) 
                | if base.#{procedure-arity-includes?}(v,$len)
                  | base.#{chaperone-procedure}(v,fun($x :: $d, ...):
                                                    values(fun
                                                           | (r :: $r): r
                                                           | (bad): error("wrong result to function "), $x, ...))
                  | error("->: expected a function of arity " +& $len +& "\n  got " +& v)
                | error("->: expected a function\n  got: " +& v))'

and, if you position your mouse just right, you can see these ones (the leftmost and rightmost seem especially sketchy):

[Screenshot, 2024-03-19 at 6:15:13 PM]

Looking at the expansion, the only abnormally sized identifiers are block and parens, but parens always returns #f from identifier-binding so I think these arrows are probably all from block.
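For reference, a small sketch of the kind of check described here, using identifier-binding in plain Racket. The helper name possibly-bound? is hypothetical, and the module below is ordinary racket/base rather than an expanded Rhombus program, so it only shows how the function behaves for bound and unbound names.

```racket
#lang racket/base

;; Hypothetical helper: does this identifier refer to any binding?
(define (possibly-bound? id)
  (and (identifier-binding id) #t))

;; car is provided by racket/base, so it has a module-level binding.
(possibly-bound? #'car)    ; #t

;; parens has no binding in this module.
(possibly-bound? #'parens) ; #f
```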

mflatt commented 6 months ago

Yes, the parens example makes sense as the same sort of issue, since shrubbery-level parentheses are represented with a parens symbol in the S-expression encoding. And I agree with your interpretation of the purple arrows to various floating points inside the giant green blob.