jyc / lalr-scm

Automatically exported from code.google.com/p/lalr-scm
0 stars 0 forks source link

on glr status #7

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
In porting to R6RS Schemes I tried to understand the workings of the GLR
driver. I am a beginner on the subject; it is my understanding that, when
the grammar is non-ambiguous, the results with LR and GLR should be the
same. With the following terminals and grammar (a little modified calc
example):

  (define parser-terminals
    '(N O C T
    (left: A)
    (left: M)
    (nonassoc: U)))

  (define parser-non-terminals
    '((script   (lines)     : #f)

      (lines    (lines line)    : (yycustom $2)
        (line)      : (yycustom $1))

      (line (T)     : #\newline
        (E T)       : $1
        (error T)   : #f)

      (E    (N)     : $1
        (E A E)     : ($2 $1 $3)
        (E M E)     : ($2 $1 $3)
        (A E (prec: U)) : ($1 $2)
        (O E C)     : $2)))

the precedence of M above A is not respected. For example "1+2*3" gives 9
rather than 7. Is this correct, incorrect, or maybe a bug I have introduced
in porting?

Original issue reported on code.google.com by mrc....@gmail.com on 11 Aug 2009 at 10:11

GoogleCodeExporter commented 8 years ago
This looks really like a bug I introduced in porting. Trying the following with 
Guile
and the original code (the non-Snow distribution version) prints:

==> (1 * (2 * 3))
==> ((1 * 2) * 3)

which is correct, I guess.

(load "lalr.scm")

(define (display-result v)
  (if v
      (begin
        (display "==> ")
        (display v)
        (newline))))

(define eoi-token
  (make-lexical-token '*eoi* #f #f))

      (define (doit . tokens)
    (define calc-parser
      (lalr-parser
       (driver: glr)
       (expect: 0)

       (NUM LPAREN RPAREN
        (left: + -)
        (right: * /)
        (nonassoc: uminus))

       (output  (expr)          : (display-result $1))

       (expr    (expr + expr)           : (list $1 '+ $3)
            (expr - expr)           : (list $1 '- $3)
            (expr * expr)           : (list $1 '* $3)
            (expr / expr)           : (list $1 '/ $3)
            (- expr (prec: uminus)) : (list '- $2)
            (NUM)                   : $1
            (LPAREN expr RPAREN)    : $2)))

    (let* ((lexer       (lambda ()
                  (if (null? tokens)
                      eoi-token
                    (let ((t (car tokens)))
                      (set! tokens (cdr tokens))
                      t))))
           (error-handler   (lambda args #f)))
      (calc-parser lexer error-handler)))

      ;;right-associativity
      (doit (make-lexical-token 'NUM #f 1)
        (make-lexical-token '*   #f '*)
        (make-lexical-token 'NUM #f 2)
        (make-lexical-token '*   #f '*)
        (make-lexical-token 'NUM #f 3)
        eoi-token)

But then this:

      (doit (make-lexical-token 'N #f 1)
        (make-lexical-token 'A #f '+)
        (make-lexical-token 'N #f 2)
        (make-lexical-token 'M #f '*)
        (make-lexical-token 'N #f 3)
        eoi-token)

prints nothing and I can see that the GLR driver goes into an error action. This
should not happen, no?

Original comment by mrc....@gmail.com on 12 Aug 2009 at 3:24

GoogleCodeExporter commented 8 years ago
Further inspection revealed to me that when there is a conflict, and GLR is the
driver, lalr includes both the actions (both the shifts and both the 
reductions).
This inclusion will generate a fork in the process while the parser is running.

There is still one thing I do not understand. If "*" has precedence over "+", 
the
following results:

10 * 2 + 3 -> 23

but the following results:

1 + 2 * 3 -> 9, 7

that is, the first expression generates a single parse result, while the second
generates two results (with and without precedence). Is there a rationale for 
making
the first expression return a single result?

Original comment by mrc....@gmail.com on 21 Aug 2009 at 3:10