racket / drracket

DrRacket, IDE for Racket
http://www.racket-lang.org/
Other
452 stars 94 forks source link

Meta lang causes arrow regions to start at the space before the module lang #486

Closed greghendershott closed 3 years ago

greghendershott commented 3 years ago
#lang racket/base
vector

The arrow is between "vector" and "racket/base".

In Dr Racket GUI, hovering over the space before "racket/base" produces no arrow.

As expected.


#lang at-exp racket/base
vector

The arrow is between "vector" and " racket/base" -- the latter includes a leading space.

In Dr Racket GUI, hovering over the space before "racket/base" produces the arrow.

You're unlikely to notice this in the Dr Racket GUI.

You might notice this if you use the add-arrow from-start coordinate.


I'm not sure how important this is. I only noticed it reviewing some data.

I glanced in traversals.rkt and didn't quickly spot how/where this happens. I'm deep into something else and can't detour now. So for now I'm just logging it here as a "someday/maybe" to-do.

greghendershott commented 3 years ago

I looped back to look a little more.

The syntax given to check-syntax for the lang has the wrong syntax-position. Apparently a meta-lang causes the trailing whitespace not to be consumed before the next read ... or something like that?

Anyway it seems this doesn't originate in check-syntax. The only question might be whether it ought to try to detect and compensate for this situation, and if so how.

rfindler commented 3 years ago

Thanks @greghendershott ! Do you have a small, self-contained example that we might look at (and eventually add to a test suite somewhere) that we could use to argue over what's the right behavior?

greghendershott commented 3 years ago

Yes, I can "test-suite-ize" that example above -- wrap it inside a call to check-syntax to produce an add-arrow where the start-pos is one less than you would expect.

I need to do something else now, but will post that back here later today.


p.s. From digging a bit more, I suspect that this originates with make-meta-reader, and, possibly the not-consuming-whitespace thing is a deliberate feature. But that's as far as I got on that front.

sorawee commented 3 years ago

The problem affects not only syntax-position, but also syntax-span and syntax-column.

#lang racket

(define (get-info s)
  (parameterize ([read-accept-reader #t])
    (define p (open-input-string s))
    (port-count-lines! p)
    (define stx (third (syntax->list (read-syntax #f p))))
    (cons (syntax-column stx) (syntax-span stx))))

(get-info "#lang racket")
; '(6 . 6)
; "racket" is at column 6, and has length 6

(get-info "#lang at-exp         racket")
; '(12 . 15)
; "         racket" is at column 12, and has length 15
greghendershott commented 3 years ago

@sorawee That's a great example thanks.

I wonder if I should close this issue, here, and instead open one on the racket repo?

I feel like it's probably not worth speculating how check-syntax might be able to work around this, without first understanding what's going on. (And I personally don't grok the meta-lang mechanism. Maybe this should be fixed in make-meta-reader? Or maybe it's deliberately left to each meta lang to skip its definition of whitespace? Or something else entirely? I just don't know.)

rfindler commented 3 years ago

Well, if the docs don't give you the answers you seek then I would say that that's a docs bug! (Not helpful, I know :))

sorawee commented 3 years ago

The problem is with the the read-spec argument of make-meta-reader. By default:

read-spec extracts a non-empty sequence of bytes after one or more space and tab bytes, stopping at the first whitespace byte or end-of-file (whichever is first), and it produces either such a byte string or #f

And the source locations are computed from the location before and after read-spec is called, which is the cause of the problem.

It is likely that user-supplied read-spec will do something similar: consume white spaces, and then consume a token. So fixing this needs to be careful to not break user-supplied read-spec.

One possible fix is to add another keyword argument skip-space? that, when true, will consume all spaces first, before calling read-spec. It must default to #f to preserve backward compatibility.

at-exp, s-exp, etc. would then invoke make-meta-reader with skip-space? as #t and with a custom read-spec that does not consume whitespaces.

It is not ideal because the default options are still bad, but this is the best solution I can come up with while maintaining backward compatibility.

greghendershott commented 3 years ago

@sorawee Thanks for investigating this!

I see. As you quoted, the prose definitely talks about one or more leading spaces/tabs --- not zero or more --- and that's the default read-spec implementation, too:

           [read-spec
            (lambda (in)
              (let ([spec (regexp-try-match #px"^[ \t]+(.*?)(?=\\s|$)" in)]) ;; if this changes, the regexp in planet's lang/reader.rkt must also change
                (and spec (let ([s (cadr spec)])
                            (if (equal? s "") #f s)))))]

I think your proposal is pragmatic.

One possible downside is it will be tricky for someone who wants to supply a meta lang that can work for both older and newer versions of make-meta-reader (that either lack or have the proposed optional keyword argument).

That probably doesn't matter for at-exp and things that ship with Racket. It might matter for third party ones. I realize people can make package version exceptions, but that seems kind of yucky to need to do just for this.


As another possible idea:

What if make-meta-reader leaves read-spec as-is --- both its default implementation, and correct user-supplied value, will continue to unconditionally expect one or more leading spaces.

The change: Before calling read-spec, it uses set-port-next-location! on the input port, to adjust it. Maybe it peeks the number of spaces/tabs, increments the loc by that number, keeps the spaces/tabs in the port, and hands it to read-spec.

Or a similar approach?

Would that be smart clever... or terrible clever?

greghendershott commented 3 years ago

I think there's an even simpler fix; I made a PR on the racket repo.