racket / parser-tools

cannot use `start` or `atok` as terminals #7

Open mbutterick opened 6 years ago

mbutterick commented 6 years ago

This bug was first reported on my fork of parser-tools. As best I could tell, the problem occurs at this location, where this maneuver happens:

#`(grammar (start [() null]
                  [(atok start) (cons $1 $2)])
           (atok [(tok) (make-tok 'tok-id 'tok $e pos ...)] ...)))
#`(start start)

The solution I came up with (it passes all tests, though I can’t claim to understand why it works) involves renaming some identifiers like so:

(with-syntax ([%start start-id-temp]
              [%atok atok-id-temp])
  #`(grammar (%start [() null]
                     [(%atok %start) (cons $1 $2)])
             (%atok [(tok) (make-tok 'tok-id 'tok $e pos ...)] ...)))
(with-syntax ([%start start-id-temp])
  #`(start %start))

I’d like to contribute a patch but there’s an ugliness I don’t know how to fix:

One can’t literally use %start and %atok as the new identifiers, because then %start and %atok can’t be used as terminals. Hence the use of the variables start-id-temp and atok-id-temp, which hold the temp identifiers. But where to get these identifiers?

“Dude, just use generate-temporary.” Turns out that won’t work. IIUC the parser spawns multiple threads. If these identifiers are different across threads, then the results can’t be recombined. So although the names don’t have to be consistent across every run of the parser, they apparently do have to be consistent among threads.

I have no idea how to make this happen. My so-terrible-it-works idea was to assign very long, very weird names for start-id-temp and atok-id-temp. But I’m not prepared to pollute the Racket codebase with that.
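To make the difference concrete, here is a minimal sketch of why the two approaches behave differently. It is not taken from parser-tools and makes no claim about how the parser actually compares identifiers; the name `start-id-with-an-unlikely-name` is purely hypothetical. It only shows that independent calls to generate-temporary never agree with each other, while a fixed name built twice in the same context does:

```racket
#lang racket/base
(require racket/syntax) ; provides generate-temporary

;; Two independent calls produce two different identifiers,
;; even when given the same name base.
(define temp-a (generate-temporary 'start))
(define temp-b (generate-temporary 'start))

;; A fixed name (hypothetical, for illustration only), built twice
;; with the same lexical context, produces matching identifiers.
(define ctx #'here)
(define fixed-a (datum->syntax ctx 'start-id-with-an-unlikely-name))
(define fixed-b (datum->syntax ctx 'start-id-with-an-unlikely-name))

(eq? (syntax-e temp-a) (syntax-e temp-b))   ; #f: the names differ
(bound-identifier=? temp-a temp-b)          ; #f
(eq? (syntax-e fixed-a) (syntax-e fixed-b)) ; #t: same name
(bound-identifier=? fixed-a fixed-b)        ; #t
```

So any code path that creates its temporaries independently would end up with names that don't match, whereas the fixed-name trick agrees everywhere, at the cost of reserving that name.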

soegaard commented 5 years ago

Ping.