Closed peblair closed 8 years ago
P.S. After reading the spec, the main thing that still seems to be unsupported (readtable-wise) is the #<n>=<datum> / #<n># syntax. Racket has utilities for this in the form of read-accept-graph, but simply enabling that parameter is not enough to make it work. I just gave it a cursory inspection, though, so there might be a trick to enabling it.
Edit: I forgot about #!fold-case and #!no-fold-case also.
Thanks a lot for the help! This looks good. I left one comment, and could you also make sure all the files have trailing newlines? Make those minor adjustments, and I'm happy to merge this.
I removed the r7rs: prefix (it was left over from when I had originally implemented an r7rs:read and r7rs:read-syntax pair of functions) and added those EOF newlines.
I'm glad to have helped!
Looks good! I just cloned it, and I did discover a small problem: it doesn't seem to work on HEAD, and therefore it won't work on 6.3. Looks like it doesn't work with the new macro expander. I tried it with this program:
#lang r7rs
(import (scheme base))
(display #u8(1 2 3))
I got this error:
r7rs/lang/r7rs-reader.rkt:39:13: bytevector: identifier's binding is ambiguous
context.:
#(-458352200736611477 module) #(-458352200736611476 module r7rs-reader 0)
#(1755194 module) #(1755195 module anonymous-module 0)
matching binding.:
#<module-path-index:(r7rs/base)>
#(1755194 module) #(1755195 module anonymous-module 0)
matching binding.:
#<module-path-index:((quote #%kernel))>
#(-458352200736611477 module) #(-458352200736611476 module r7rs-reader 0)
in: bytevector
This type of thing is sometimes a little tricky: reader extensions that introduce identifiers are not as clean as they could be. You might need to fiddle with make-syntax-introducer to add a new scope in order to keep the binding unambiguous.
(As an aside, this would be a perfect thing to write some tests for, but I admit I haven't gotten around to getting a good testing setup for this package yet, so I'm okay with holding off on that for now.)
Turns out the fix was pretty simple: delete all lexical context from the syntax returned from the reader extension (i.e. (datum->syntax #f <syntax goes here> <src-location>)). It was documented here.
I also went ahead and threw in a function to add the readtable extensions to the REPL as well.
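For readers following along, here is a minimal sketch of what "deleting all lexical context" means. It is not the code from this pull request: strip-context and head-binding are hypothetical helper names, and `bytes` stands in for whatever identifier the reader extension introduces.

```racket
#lang racket/base

;; Hypothetical helper (not the package's actual code): rebuild a syntax
;; object with no lexical context, keeping only the source location.
(define (strip-context stx)
  (datum->syntax #f (syntax->datum stx) stx))

;; A syntax object that carries this module's lexical context.
(define with-context (quote-syntax (bytes 1 2 3)))
(define stripped (strip-context with-context))

;; Look up the binding of the identifier in head position.
(define (head-binding stx)
  (identifier-binding (car (syntax-e stx))))

(head-binding with-context) ; a module binding for `bytes`
(head-binding stripped)     ; no binding, because the scopes are gone
```

Because the stripped identifier has no scopes of its own, it is resolved against whatever context it is later given, which is the hygiene concern raised in the replies that follow.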
Hmm. I'm not completely convinced that's the right approach, but I'm not sure. Does it work if the bytevector reader syntax is used in a module in which the bytevector binding is renamed or not available? I think we should try and maintain hygiene.
I agree with your love of hygiene, but the reader already does weird things with readtable expansions. Consider this broken piece of code:
#lang racket
(define quote +)
'(2 2)
(I had intended for this to expand to (+ 2 2), but it behaves much more poorly than that)
Is this not the same kind of idea?
EDIT: This better describes what I meant to demonstrate:
#lang racket
(define quote (λ(x) (+ x 2)))
'2
Yes, quote, quasiquote, unquote, and unquote-splicing are unhygienic. But this is not a reader abbreviation in quite the same way (and those unhygienic extensions are holdovers from history). This program works just fine:
#lang racket/base
(define (vector . args) args)
#(1 2 3) ; => '#(1 2 3), not '(1 2 3)
We should use the better behavior given that we have the opportunity. I'm willing to figure out how to handle making hygiene work, though, if you'd like.
@AlexKnauth has a meta-lang that provides hygienic versions of quote, quasiquote, syntax, etc. Looking to that as a guide may make it easier to figure out how to do this hygienically: https://github.com/AlexKnauth/hygienic-quote-lang.
Yes, I was thinking about that. It looks like this commit addresses specifically the problem we're talking about: https://github.com/AlexKnauth/hygienic-quote-lang/commit/d50b7fd2cff48a2889b85b00d2714d38c2f67fcf
However, I'm not completely sure that's right... we might not want to apply make-syntax-introducer at all if we're running under the old macro expander, since I think using make-syntax-introducer may mess with syntax-original? or other properties. I'm not really sure, though. I should try and break Alex's package to see if it's a problem, and if not, it sounds like the way to go.
After some investigation, it looks like quote does make syntax-original? return #f, while racket/base's quote returns #t. I think maybe we should try to use Alex's approach and check the arity of make-syntax-introducer, but let's just not apply it at all if it doesn't support any arguments. Let's see if that causes any problems.
You can use (make-syntax-introducer #t) instead of (make-syntax-introducer) to preserve syntax-original?. You'll definitely have to do that for the outer scope, but for the inner scope (make-syntax-introducer) will be fine (Edit: (make-syntax-introducer #t) will also be fine). The outer syntax object will not be original, but the pieces inside that you care about will be.
Edit: See also https://github.com/AlexKnauth/hygienic-quote-lang/issues/1#issuecomment-151468490
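The difference between the two introducer forms can be checked directly. The following is a small sketch, assuming a Racket version with the scope-based expander (6.3 or later); the names stx, macro-intro, and use-site-intro are illustrative only.

```racket
#lang racket/base

;; read-syntax marks its result as original source syntax.
(define stx (read-syntax 'example (open-input-string "x")))

;; Default introducer: adds a macro-introduction scope.
(define macro-intro (make-syntax-introducer))
;; With #t: adds a use-site scope instead.
(define use-site-intro (make-syntax-introducer #t))

(syntax-original? stx)                  ; original, straight from the reader
(syntax-original? (macro-intro stx))    ; macro-introduction scope: not original
(syntax-original? (use-site-intro stx)) ; use-site scope: still original
```

This works because syntax-original? only rejects syntax that carries macro-introduction scopes; use-site scopes are not counted against it.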
I ended up implementing this myself in 1d1b9b125268b8ea22b4d36befdefda50003af1c. I sidestepped the issue by just returning a bytevector as syntax instead of expanding to a function call, which is valid since Racket actually has reader syntax for bytevectors (#"foo").
Thanks for the effort, anyway, though, and sorry I didn't get to merge this in.
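As an illustration of the approach described above, and not the contents of the referenced commit, a reader extension can return the byte string itself, so no identifier is ever introduced. In this sketch, read-u8 and u8-readtable are hypothetical names and error handling is kept minimal.

```racket
#lang racket/base

;; Hypothetical dispatch-macro handler for "#u8(...)". The reader has
;; already consumed "#u" when this is called, so we expect "8" next.
;; The optional arguments let the same procedure serve both `read`
;; (2 arguments) and `read-syntax` (6 arguments).
(define (read-u8 ch in [src #f] [line #f] [col #f] [pos #f])
  (unless (eqv? (read-char in) #\8)
    (error 'read-u8 "expected #u8"))
  ;; Read the parenthesized list of byte values, then pack them
  ;; into a byte string. No call to `bytevector` is generated.
  (define elems (read/recursive in))
  (apply bytes elems))

(define u8-readtable
  (make-readtable (current-readtable) #\u 'dispatch-macro read-u8))

(define result
  (parameterize ([current-readtable u8-readtable])
    (read (open-input-string "#u8(1 2 3)"))))

result ; a byte string holding the bytes 1, 2, 3
```

Since the result is plain data, it does not matter whether bytevector is renamed or unavailable in the module that uses the literal.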
Oh, I forgot you could do that. Yes, that's definitely a better approach. Should it return an immutable byte vector instead of a mutable one though, since literals are supposed to be immutable?
The syntax system happily converts the byte string to an immutable one automatically, so I didn't worry about that. The R7RS standard also allows literals to be immutable, so that's perfectly conformant.
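The automatic conversion mentioned above can be observed with datum->syntax, which interns literals such as byte strings as immutable values. A minimal check, with illustrative names:

```racket
#lang racket/base

;; `bytes` produces a mutable byte string.
(define mutable-bytes (bytes 1 2 3))

;; Wrapping it as syntax interns the literal as an immutable byte string.
(define stx (datum->syntax #f mutable-bytes))

(immutable? mutable-bytes)               ; the original is mutable
(immutable? (syntax-e stx))              ; the wrapped literal is immutable
(equal? (syntax-e stx) mutable-bytes)    ; the contents are unchanged
```

This only covers the read-syntax path. A plain read that returns the byte string directly hands back whatever the reader extension constructed.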
But is (read) also supposed to return an immutable one?
Well, the spec doesn't actually say. It's lenient here, so pretty much anything goes. You could certainly make a case that read and read-syntax should work the same, but I didn't give it too much thought.
Feel free to look this over. I just thought I'd help contribute a bit.
Fair warning: the #u8 readtable syntax works, but DrRacket does not highlight it as a non-error.