ruricolist / serapeum

Utilities beyond Alexandria
MIT License
420 stars 41 forks source link

Require code that uses STATIC-LET to be compiled #114

Closed phoe closed 2 years ago

phoe commented 2 years ago

(Thanks for the review, @Hexstream. Can't comment in #113, so I post here.)

There's a missed case where STATIC-LET is allowed to not work if the code is not compiled at all. I haven't caught it since I only tested on SBCL, CCL, and ECL, all of which don't show this behavior, but there are other implementations on which it will work differently - see below.

I think it would be good to add this wart to the documentation strings of STATIC-LET and STATIC-LET*. The fallout will be limited, since most Lisp code is compiled (by ASDF or else), but this can still be a source of pain in interpreted code, e.g. when playing around in the REPL.

I'll add this wart to the STATIC-LET article.

Alternatively = is there a way to check at macroexpansion-time if a piece of code is being compiled via COMPILE-FILE or COMPILE, at which point we can signal an error rather than proceed? The former is easy, because compilation variables will be non-NIL, but the latter isn't something that I know how to solve yet.

Alternatively, is there a way to reliably use this behavior of the LOAD-TIME-VALUE to our benefit? To quote Thanos, we can use the stones to destroy the stones - like, we can try to use a possibly-uncompiled LOAD-TIME-VALUE at macroexpansion time and see if the value differs, and only then signal an error. How exactly to execute that and how to assert its portability is beyond my current knowledge though.

Rationale

CLHS 3.2.2.2 Minimal Compilation states (emphasis mine):

The first argument in a load-time-value form in source code processed by compile is evaluated at compile time; in source code processed by compile-file, the compiler arranges for it to be evaluated at load time. In either case, the result of the evaluation is remembered and used later as the value of the load-time-value form at execution time.

More, CLHS Special Operator LOAD-TIME-VALUE states (emphasis mine):

If a load-time-value expression is processed by compile-file, the compiler performs its normal semantic processing (such as macro expansion and translation into machine code) on form, but arranges for the execution of form to occur at load time in a null lexical environment, with the result of this evaluation then being treated as a literal object at run time. It is guaranteed that the evaluation of form will take place only once when the file is loaded, but the order of evaluation with respect to the evaluation of top level forms in the file is implementation-dependent.

If a load-time-value expression appears within a function compiled with compile, the form is evaluated at compile time in a null lexical environment. The result of this compile-time evaluation is treated as a literal object in the compiled code.

If a load-time-value expression is processed by eval, form is evaluated in a null lexical environment, and one value is returned. Implementations that implicitly compile (or partially compile) expressions processed by eval might evaluate form only once, at the time this compilation is performed.

COMPILE-FILE and COMPILE have a "must", where EVAL only has a "might". This means that functions defined using EVAL, without compilation, can cause a new object to be instantiated every time, which will both call the initialization form every time the body is entered and break all code that depends on the static binding values to stay static.

So, in order to get the behavior we want (in which an object is only allocated once), the code containing the STATIC-LET form must be compiled, at which point the load-time values will either be instantiated (in case of COMPILE) or stored in the resulting FASLs to be instantiated at load time (in case of COMPILE-FILE).

A quick test

(defun test-function ()
  (let ((counter-var (load-time-value (cons 0 nil))))
    (symbol-macrolet ((counter (car counter-var)))
      (incf counter))))

(format t "Possibly not compiled code: ~D ~D ~D~%"
        (test-function) (test-function) (test-function))

(defun test-function-2 ()
  (let ((counter-var (load-time-value (cons 0 nil))))
    (symbol-macrolet ((counter (car counter-var)))
      (incf counter))))

(compile 'test-function-2)

(format t "Compiled code: ~D ~D ~D~%"
        (test-function-2) (test-function-2) (test-function-2))

Implementations - both before and after COMPILE

SBCL 2.1.11

Possibly not compiled code: 1 2 3
Compiled code: 1 2 3

CCL 1.12

Possibly not compiled code: 1 2 3
Compiled code: 1 2 3

ECL 21.2.1

Possibly not compiled code: 1 2 3
Compiled code: 1 2 3

Clasp current

Possibly not compiled code: 1 2 3
Compiled code: 1 2 3

Implementations - only after COMPILE

ABCL 1.8.0

Possibly not compiled code: 1 1 1
Compiled code: 1 2 3

CLISP 2.49.93+

Possibly not compiled code: 1 1 1
Compiled code: 1 2 3

ACL 10.1 Express

Possibly not compiled code: 1 1 1
Compiled code: 1 2 3

LW 7.1.2 Personal

Possibly not compiled code: 1 1 1
Compiled code: 1 2 3
phoe commented 2 years ago

We can use the LOAD-TIME-VALUE to defeat the LOAD-TIME-VALUE.

The following code snippet will signal an error if LOAD-TIME-VALUE is unsafe to use for STATIC-LET:

(flet ((fn () (let ((x (load-time-value (list 0)))) (incf (car x)))))
  (when (= (fn) (fn))
    (error "STATIC-LET will not work in uncompiled code.")))

E.g. on CLISP:

[1]> (defun foo ()
       (flet ((fn () (let ((x (load-time-value (list 0)))) (incf (car x)))))
         (when (= (fn) (fn))
           (error "STATIC-LET will not work in uncompiled code."))))
FOO

[2]> (foo)
*** - STATIC-LET will not work in uncompiled code.
The following restarts are available:
ABORT          :R1      Abort main loop

Break 1 [3]> :r1

[4]> (compile 'foo)
FOO ;
NIL ;
NIL

[5]> (foo)
NIL

The main issue is that it'll need to be inserted for implementations which are not always-compiled, so CLISP, ABCL, LispWorks, and Allegro as demonstrated above. SBCL, CCL, ECL, and Clasp won't need that; I'm asking if ECL interprets code in some situations yes it always does.

If we add an inline declaration for fn up there, I guess it'll work fine.

ruricolist commented 2 years ago

I think if it's possible the best outcome would be to add a new static-load-time-value utility that can be used in place of load-time-value when identity is important. For example I suspect this problem might also affect the synchronized macro.

phoe commented 2 years ago

How would static-load-time-value work? It would need something like load-time-value over compile nil over lambda over load-time-value over make-static-binding in order to make it happen; I have no idea how to make that work yet, too many layers of abstraction even after a while of trying.


Also regarding #113 and the original form of FIRST-TIME-VALUE, I can't reproduce the error on CCL 1.12 - there was no regression test added with the commit at https://github.com/Hexstream/first-time-value/commit/90d3b0d5cb1c69c3b8ae17af18184a83c026c533 that would outline the faulting behavior. The original pre-patched code seems to work fine with the following simple test, I don't yet know where exactly that would fail.

Clozure Common Lisp Version 1.12 (v1.12) LinuxX8664

For more information about CCL, please see http://ccl.clozure.com.

CCL is free software.  It is distributed under the terms of the Apache
Licence, Version 2.0.

? (defmacro first-time-value (form)
    (let ((cache-var (gensym (string '#:cache))))
      `(let ((,cache-var (load-time-value (cons nil nil))))
         (if (car ,cache-var)
             (cdr ,cache-var)
             (prog1 (setf (cdr ,cache-var) ,form)
               (setf (car ,cache-var) t))))))
FIRST-TIME-VALUE

? (defvar *x*)
*X*

? (defun foo () (first-time-value (1+ *x*)))
FOO

? (let ((*x* 42)) (foo))
43

? (foo)
43

? (let ((*x* 0)) (foo))
43

Also, all of the current tests of FIRST-TIME-VALUE pass as-is on CCL 1.12 with the old version of the FIRST-TIME-VALUE macro. I tested six times, with three compiler policies (default compiler settings; high speed; high speed and zero safety) and two compiler frontends (COMPILE; COMPILE-FILE).

Whatever issue might have been there (possibly https://trac.clozure.com/ccl/ticket/1317 ?) seems to have been fixed as of that CCL version.

Preparation (with a simplified test framework that uses just ASSERT):

Clozure Common Lisp Version 1.12 (v1.12) LinuxX8664

For more information about CCL, please see http://ccl.clozure.com.

CCL is free software.  It is distributed under the terms of the Apache
Licence, Version 2.0.

? (defmacro first-time-value (form)
    (let ((cache-var (gensym (string '#:cache))))
      `(let ((,cache-var (load-time-value (cons nil nil))))
         (if (car ,cache-var)
             (cdr ,cache-var)
             (prog1 (setf (cdr ,cache-var) ,form)
               (setf (car ,cache-var) t))))))
FIRST-TIME-VALUE

? (defmacro first-time-values (form)
    `(values-list (first-time-value (multiple-value-list ,form))))
FIRST-TIME-VALUES

? (defmacro are (comp expected form)
    `(assert (,comp ,expected (multiple-value-list ,form))))
ARE

Actual tests:

? (let* ((eval-count 0)
         (nested (lambda ()
                   (lambda ()
                     (lambda ()
                       (first-time-value
                        (prog1 8
                          (incf eval-count))))))))
    (are equal '(8 8 1)
         (values (funcall (funcall (funcall nested)))
                 (funcall (funcall (funcall nested)))
                 eval-count)))
NIL

? (let* ((eval-count 0)
         (nested (lambda ()
                   (lambda ()
                     (lambda ()
                       (first-time-values
                        (multiple-value-prog1
                            (values 8 16 32)
                          (incf eval-count))))))))
    (are equal '(8 16 32 8 16 32 1)
         (multiple-value-call #'values
           (funcall (funcall (funcall nested)))
           (funcall (funcall (funcall nested)))
           eval-count)))
NIL

? (let* ((eval-count 0)
         (nested (lambda ()
                   (lambda ()
                     (lambda ()
                       (handler-case
                           (first-time-value
                            (prog1 8
                              (incf eval-count)
                              (when (<= eval-count 2)
                                (error "Uh oh."))))
                         (error () 'error)))))))
    (are equal '(error error 8 8 3)
         (flet ((call-it ()
                  (funcall (funcall (funcall nested)))))
           (values (call-it) (call-it) (call-it) (call-it) eval-count))))
NIL
ruricolist commented 2 years ago

As a proof of concept, this seems to work for LispWorks and CLISP:

(defmacro static-load-time-value (form &optional read-only)
  `(progn
     (flet ((fn () (load-time-value (random most-positive-fixnum))))
       (declare (notinline fn)) ; LW needs this.
       (unless (= (fn) (fn))
         (error "Cannot use ~s here" 'static-load-time-value)))
     (load-time-value ,form ,read-only)))
ruricolist commented 2 years ago

That is, it works in the sense that if you substitute static-load-time-value for load-time-value in test-function, it signals an error if you call test-function before it is compiled but works normally afterwards.

phoe commented 2 years ago

Oh - so you mean something that signals an error, rather than something that is guaranteed to always work. I misunderstood you then.

Sure, that works - add a more informative error message and it's good to go. Something about STATIC-LET or SYNCHRONIZED not being usable in code that hasn't been minimally compiled, fix with calling COMPILE or COMPILE-FILE on code that uses it.

phoe commented 2 years ago

An idea from #lisp on Libera Chat is to signal a warning rather than an error, since the code that depends on the binding will still work in evaluator/interpreter mode, just with different semantics (they will be that of LET rather than STATIC-LET).

It's not an unreasonable idea, even if personally I'm not a fan of it. IMO, STATIC-LET as an operator should guarantee to work the same way a closure would, and signal an error whenever it can't meet that guarantee.

phoe commented 2 years ago

There's an alternative implementation suggested by jasom on Libera Chat that doesn't use L-T-V at all, but instead accesses value cells of a generated symbol.

(defmacro static-let ((name init) &body body)
  (let ((x (gensym)))
    `(progn
       (unless (boundp ',x)
         (setf (symbol-value ',x) ,init))
       (symbol-macrolet ((,name (symbol-value ',x)))
         ,@body))))

(defun bar ()
  (static-let (x 0)
    (incf x)))

EDIT: And one more that uses locally special variables:

(defmacro static-let ((name init &optional (type t)) &body body)
  (let ((x (gensym)))
    `(locally
         (declare (special ,x))
       (unless (boundp ',x)
         (setf ,x (the ,type ,init)))
       (symbol-macrolet ((,name (the ,type ,x)))
         ,@body))))

(defun bar ()
  (declare (optimize (speed 3)))
  (static-let (x 0 fixnum)
    (incf x)))
ruricolist commented 2 years ago

Unfortunately using gensyms as above (or as in #113) isn't enough to preserve identity in code that isn't compiled, since a function that is just evaluated but not compiled can re-expand its macros on every call (and does on ABCL, at least!).

I've pushed a preliminary implementation of static-load-time-value to a branch of the same name. It moves the actual "Thanos" test out of line into a function with a no-op compiler macro expansion, so the test should be compiled out by any implementation that supports compiler macros. (I'm making the assumption that there is no implementation that expands compiler macros when just evaluating - I don't think that's too optimistic.)

phoe commented 2 years ago

I need to start thinking in terms of evaluated non-compiled code. You're right about gensyms.

The static-load-time-value branch looks good to me, as long as we can rely on compiler macros being always expanded by implementations. I'll need to do a test for all contemporary implementations to see if that is the case in practice. (Whew, it's time to ros install clasp-bin... Nope, that doesn't work.)

ruricolist commented 2 years ago

If the compiler macro isn't expanded, the worst that can happen is a little extra work. It's only invalid if there is an implementation that expands compiler macros when evaluating uncompiled code. IMO that would be against the spec, but it's probably still worth actually checking.

phoe commented 2 years ago

IMO that would be against the spec, but it's probably still worth actually checking.

It isn't against the spec. CLHS 3.2.2.1.3 states:

When eval encounters a form during processing that represents a call to a compiler macro name (that is not declared notinline), eval might expand the compiler macro, and might use the expansion in place of the original form.

But:

If the compiler macro isn't expanded, the worst that can happen is a little extra work.

OK, that works. LGTM.