larcenists / larceny

Larceny Scheme implementation
Other
202 stars 32 forks source link

Infinite loop in Larceny's syntax expander? #782

Closed mnieper closed 7 years ago

mnieper commented 8 years ago

I tried to feed the following R7RS code (generated by my Rapid Scheme expander from an example program calculating the factorial of 3) into Larceny:

(import (scheme base)
    (scheme case-lambda)
    (scheme write))
(letrec
    ((g5_f
      (case-lambda
       ((g11 g10 g9 g8_x)
    ((letrec
         ((g15
           (case-lambda
        ((g12)
         (if (= g8_x 0)
             (g12 1)
             (g5_f
              (letrec
              ((g14
                (case-lambda
                 ((g13)
                  (g12 (* g8_x g13))))))
            g14)
              #f g9 (- g8_x 1)))))))
       g15)
     g11)))))
  (g5_f
   (letrec
       ((g17
     (case-lambda
      ((g16) (display g16) (newline) (if #f #f)))))
     g17)
   #f
   (quote ())
   3))

Instead of printing the expected result of 6, Larceny begins to constantly use 100% of one CPU and eats 3 GB of memory. Entering the debugger shows that r6rs-expander.sch is near the top of Larceny's call stack.

What's interesting and which may hint to the cause of the error: If I change the first appearance of #f in the program into g10, the error vanishes and Larceny happily prints 6 on the console.

NB: The parameters g10 and g9 are obviously irrelevant to computing the factorial here. They have been inserted by the code generator to be able to pass additional information (continuation marks) around.

mnieper commented 8 years ago

I simplified the example program by hand to make it easier to locate the error:

(import (scheme base) (scheme write))
(define (g5_f g11 g10 g9 g8_x)
  (if (= g8_x 0)
      (g11 1)
      (g5_f
       (lambda (g13)
         (g11 (* g8_x g13)))
       #f g9 (- g8_x 1))))
(g5_f (lambda (g16)
        (display g16) (newline))
      #f
      #f
      3)

Experiments showed that the following two versions of the program do work:

(import (scheme base) (scheme write))
(define (g5_f g11 g10 g9 g8_x)
  (if (= g8_x 0)
      (g11 1)
      (g5_f
       (lambda (g13)
         (g11 (* g8_x g13)))
       #f g9 (- g8_x 1))))
(g5_f (lambda (g16)
        (display g16) (newline))
      #t
      #f
      3)
(import (scheme base) (scheme write))
(define (g5_f g11 g10 g8_x)
  (if (= g8_x 0)
      (g11 1)
      (g5_f
       (lambda (g13)
         (g11 (* g8_x g13)))
       #f (- g8_x 1))))
(g5_f (lambda (g16)
        (display g16) (newline))
      #f
      3)
WillClinger commented 7 years ago

I have been unable to duplicate this bug, but I'll leave the ticket open a while longer.

mnieper commented 7 years ago

Thanks for taking a look at it.

I have checked again, and the bug remains:

$ uname -a
Linux hodge 4.8.0-30-generic #32-Ubuntu SMP Fri Dec 2 03:43:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$ larceny -version
Larceny v1.1a1 (Jul  8 2016 01:16:30, precise:Linux:unified)
$ cat program.scm
(import (scheme base) (scheme write))
(define (g5_f g11 g10 g9 g8_x)
  (if (= g8_x 0)
      (g11 1)
      (g5_f
       (lambda (g13)
         (g11 (* g8_x g13)))
       #f g9 (- g8_x 1))))
(g5_f (lambda (g16)
        (display g16) (newline))
      #f
      #f
      3)
$ larceny -r7rs -program program.scm

After the last command, the terminal hangs. On another terminal I can inspect that larceny.bin takes 100% CPU time of one core, eating 33% of the available memory (8 GB).

WillClinger commented 7 years ago

Thank you. I can't duplicate it using an interactive R7RS REPL, but it's just as you say when the code is in a file and run as a program.

That is, of course, a clue.

WillClinger commented 7 years ago

This looks like an arcane compiler optimization bug. If a (useless) reference to g5_f is added at the end of the file, the bug goes away.

Which means, of course, the bug goes away whenever I try to disassemble the compiled code in any of the usual ways. I'm going to have to compile the file, edit the compiled code to make the code for g5_f accessible for disassembly, and then disassemble it. Editing compiled code is tricky, so I'm going to try some easier things first.

WillClinger commented 7 years ago

The bug goes away if interprocedural-constant-propagation is turned off.

Constant propagation is implemented by pass3folding.sch, which uses #f and #t to represent bottom and top (respectively) of the abstract domain. Constants are represented by quoted literals. Larceny's R5RS macro expander is careful to quote all literals, including #f and #t, but the R6RS/R7RS macro expander probably doesn't. I think the output of the R6RS/R7RS macro expander is run through the R5RS expander before constant propagation sees it, but I might be wrong about that, or there might be some special cases for top-level R6RS/R7RS programs and libraries that bypass the R5RS macro expander.

By the way, the bug is also present when the file is compiled as a library, so the bug can't be in Larceny's special-case code for top-level programs.

WillClinger commented 7 years ago

Fixed by changeset 96dbb72e502c6a3b7d6c57656e3463f1c7719739

The macro expansions were fine, but the bug was in interprocedural constant propagation. When constant propagation makes some argument of a known procedure useless, the compiler tries to remove those useless arguments in all calls to that procedure. The conditions for removing useless arguments have to be the same as the conditions for deleting the useless formal parameter. Those conditions were almost the same, but not quite. Under extremely rare circumstances, arguments were deleted but the formal parameter was not. That bug has probably been latent for decades.