Closed greghendershott closed 3 years ago
I looped back to look a little more.
The syntax given to check-syntax
for the lang has the wrong syntax-position
. Apparently a meta-lang causes the trailing whitespace not to be consumed before the next read
... or something like that?
Anyway it seems this doesn't originate in check-syntax
. The only question might be whether it ought to try to detect and compensate for this situation, and if so how.
Thanks @greghendershott ! Do you have a small, self-contained example that we might look at (and eventually add to a test suite somewhere) that we could use to argue over what's the right behavior?
Yes, I can "test-suite-ize" that example above -- wrap it inside a call to check-syntax
to produce an add-arrow
where the start-pos
is one less than you would expect.
I need to do something else now, but will post that back here later today.
p.s. From digging a bit more, I suspect that this originates with make-meta-reader
, and, possibly the not-consuming-whitespace thing is a deliberate feature. But that's as far as I got on that front.
The problem affects not only syntax-position
, but also syntax-span
and syntax-column
.
#lang racket
(define (get-info s)
(parameterize ([read-accept-reader #t])
(define p (open-input-string s))
(port-count-lines! p)
(define stx (third (syntax->list (read-syntax #f p))))
(cons (syntax-column stx) (syntax-span stx))))
(get-info "#lang racket")
; '(6 . 6)
; "racket" is at column 6, and has length 6
(get-info "#lang at-exp racket")
; '(12 . 15)
; " racket" is at column 12, and has length 15
@sorawee That's a great example thanks.
I wonder if I should close this issue, here, and instead open one on the racket repo?
I feel like it's probably not worth speculating how check-syntax might be able to work around this, without first understanding what's going on. (And I personally don't grok the meta-lang mechanism. Maybe this should be fixed in make-meta-reader
? Or maybe it's deliberately left to each meta lang to skip its definition of whitespace? Or something else entirely? I just don't know.)
Well, if the docs don't give you the answers you seek then I would say that that's a docs bug! (Not helpful, I know :))
The problem is with the the read-spec
argument of make-meta-reader
. By default:
read-spec extracts a non-empty sequence of bytes after one or more space and tab bytes, stopping at the first whitespace byte or end-of-file (whichever is first), and it produces either such a byte string or #f
And the source locations are computed from the location before and after read-spec
is called, which is the cause of the problem.
It is likely that user-supplied read-spec
will do something similar: consume white spaces, and then consume a token. So fixing this needs to be careful to not break user-supplied read-spec
.
One possible fix is to add another keyword argument skip-space?
that, when true, will consume all spaces first, before calling read-spec
. It must default to #f
to preserve backward compatibility.
at-exp
, s-exp
, etc. would then invoke make-meta-reader
with skip-space?
as #t
and with a custom read-spec
that does not consume whitespaces.
It is not ideal because the default options are still bad, but this is the best solution I can come up with while maintaining backward compatibility.
@sorawee Thanks for investigating this!
I see. As you quoted, the prose definitely talks about one or more leading spaces/tabs --- not zero or more --- and that's the default read-spec
implementation, too:
[read-spec
(lambda (in)
(let ([spec (regexp-try-match #px"^[ \t]+(.*?)(?=\\s|$)" in)]) ;; if this changes, the regexp in planet's lang/reader.rkt must also change
(and spec (let ([s (cadr spec)])
(if (equal? s "") #f s)))))]
I think your proposal is pragmatic.
One possible downside is it will be tricky for someone who wants to supply a meta lang that can work for both older and newer versions of make-meta-reader
(that either lack or have the proposed optional keyword argument).
That probably doesn't matter for at-exp
and things that ship with Racket. It might matter for third party ones. I realize people can make package version exceptions, but that seems kind of yucky to need to do just for this.
As another possible idea:
What if make-meta-reader
leaves read-spec
as-is --- both its default implementation, and correct user-supplied value, will continue to unconditionally expect one or more leading spaces.
The change: Before calling read-spec
, it uses set-port-next-location!
on the input port, to adjust it. Maybe it peeks the number of spaces/tabs, increments the loc by that number, keeps the spaces/tabs in the port, and hands it to read-spec
.
Or a similar approach?
Would that be smart clever... or terrible clever?
I think there's an even simpler fix; I made a PR on the racket
repo.
The arrow is between "vector" and "racket/base".
In Dr Racket GUI, hovering over the space before "racket/base" produces no arrow.
As expected.
The arrow is between "vector" and " racket/base" -- the latter includes a leading space.
In Dr Racket GUI, hovering over the space before "racket/base" produces the arrow.
You're unlikely to notice this in the Dr Racket GUI.
You might notice this if you use the
add-arrow
from-start
coordinate.I'm not sure how important this is. I only noticed it reviewing some data.
I glanced in
traversals.rkt
and didn't quickly spot how/where this happens. I'm deep into something else and can't detour now. So for now I'm just logging it here as a "someday/maybe" to-do.