egallesio / STklos

STklos Scheme
http://stklos.net
GNU General Public License v2.0
67 stars 17 forks source link

`LET` rewriter #650

Open jpellegrini opened 1 month ago

jpellegrini commented 1 month ago

Basically, nested LETs and LET*s are joined into a single LET*.

This is because going through several ENTER-LET instructions is slower than a single ENTER-LET-STAR.

We also try to turn LETs into LET*s when possible.

When do we not do this?

  1. When the bindings of one LET shadows the bindings of the other
(LET ((a 1))
  (LET ((a 2)) body1) body2)

We obviously cannot turn this into

(LET* ((a 1)
       (a 2))
  body1
  body2)
  1. When one individual binding of the inner LET makes a reference to a symbol defined previously in it.
    (LET ((a 1)
      (x (1+ a)) ;; a is NOT the one defined above!

We cannot turn this into

(LET* ((a 1)
       (x (1+ a))) ;; Oops, we now refer the the 'a' defined above!
  1. One of the LETs is a named LET.

  2. We do not rewrite SRFI-5 LETs.

  3. Finally: the expansion of LETREC is never optimized. This is because when we do so we break test 1.2 of the "R5RS Pitfalls" suite.

How is it done?

jpellegrini commented 1 month ago

Some timings:

First, an extreme case -- six nested LETs

(let ((x #f))
  (time
   (repeat 30_000_000
           (let ((a 1))
             (let ((b 2))
               (let ((c 3))
                 (let ((d 4))
                   (let ((e 5))
                     (let ((f 6))
                   (set! x a)))))))))
  x)
;; 2210.84 ms

(let ((x #f))
  (time
   (repeat 30_000_000
           (let* ((a 1)
                  (b 2)
                  (c 3)
                  (d 4)
                  (e 5)
                  (f 6))
                 (set! x a))))
  x)
;; 1201.03 ms

Now the simpler case: two nested LETs. Here the difference is really not large, but LET* is still faster.

(let ((x #f))
  (time
   (repeat 30_000_000
           (let ((a 1))
             (let ((b 2))
               (set! x a)))))
  x)
;; 981.54 ms

(let ((x #f))
  (time
   (repeat 30_000_000
           (let* ((a 1)
                  (b 2))
             (set! x a))))
  x)
;; 920.21 ms

All timings are average of 20 repetitions, and only on an amd64 machine.

jpellegrini commented 1 month ago

break test 1.2 of the "R5RS Pitfalls" suite.

Regarding this: we take the trouble to not optimize LETREC and insist in passing the test (!), but some mainstream Schemes do not pass that test:

jpellegrini commented 1 month ago

@egallesio this is ready -- passes all tests, and rewrites several LETs. It also seems to make the test suite run a tiny bit faster.

egallesio commented 1 month ago

Hi @jpellegrini,

I had not really studied carefully your code, and not even tested it :smile: However, In general, I'm not sure that your rewriting is equivalent. Here is an example

(let ()
  (define b 1000)

  ;;; <---------->
  (let ((a 1))
    (let ((b 2))
      (set! b 100)
      (print b))
    (set! b 200)
    (print b))   ;; This one is the external b
  ;;; <---------->

  (print b))

==> 
100
200
200

If we rewrite the internal let we'll have

(let ()
  (define b 1000)

  ;;; <---------->
  (let* ((a 1)
         (b 2))
    (set! b 100)
    (print b)
    (set! b 200)
    (print b))   ;; This one is the internal b
  ;;; <---------->

  (print b))

==>
100
200
1000

That is, the rewriting is OK in the let, but if you have side effects, it is possible that we do not set the good variable. Intuitively, I think it is OK iff <body2> is empty.

jpellegrini commented 1 month ago

Hi @egallesio - you're right! This is only valid if body2 is empty. Or, more precisely, if only one of the bodies of all LETs is non-empty... I can add that.

jpellegrini commented 1 month ago

Okay, I have checked. Indeed, there is no way to guarantee that the LET being rewritten is safe. I'll turn this into draft, but I think it isn't feasible...

jpellegrini commented 1 month ago

Okay, I have checked. Indeed, there is no way to guarantee that the LET being rewritten is safe. I'll turn this into draft, but I think it isn't feasible...

Although I have added some tests, including 3 nested LETs in the style you have suggested, and forbidding more than one non-empty body seems to pass them all. Anyway, maybe it's better to think about this more.

jpellegrini commented 1 month ago

Hi @egallesio ! You're right - there can be nothing after the body of the inner LET. I have changed the patch, and also included tests. I think it should work now...