Different rounding inside/outside closure

q3cpma commented 1 year ago

Got this while debugging the nbody benchmark in https://github.com/smarr/are-we-fast-yet

Here's a simple reproducer:

(eval-when (:compile-toplevel :load-toplevel :execute)
  (setf *read-default-float-format* 'double-float))

(format t "~S~%~,17G~%"
        (get-fpu-mode)
        (+ 0.35590502561699244 (* -4.917272383210972 39.47841760435743 8.15178609465426E-5)))

(funcall (lambda ()
           (format t "~S~%~,17G~%"
                   (get-fpu-mode)
                   (+ 0.35590502561699244 (* -4.917272383210972 39.47841760435743 8.15178609465426E-5)))))

which prints

(:ROUNDING-MODE :NEAREST :OVERFLOW T :UNDERFLOW NIL :DIVISION-BY-ZERO T :INVALID T :INEXACT NIL)
0.34008027853208495
(:ROUNDING-MODE :NEAREST :OVERFLOW T :UNDERFLOW NIL :DIVISION-BY-ZERO T :INVALID T :INEXACT NIL)
0.34008027853208490

phoe commented 1 year ago

Likely a fault in the constant folding process somewhere - disassembling the function shows the 0...490 value already present as a constant there.

...
;;; (format t "~S~%~,17G~%" (get-fpu-mode) (+ 0.35590502561699244 (* -4.917272383210972 39.4784176043574
    (movq (@ '0.3400802785320849 (% fn)) (% arg_z)) ;    [55]
...

q3cpma commented 1 year ago

Makes sense, indeed.

q3cpma commented 1 year ago

Actually, no, it also happens with

(eval-when (:compile-toplevel :load-toplevel :execute)
  (setf *read-default-float-format* 'double-float))

(format t "~S~%~,17G~%"
        (get-fpu-mode)
        (+ 0.35590502561699244 (* -4.917272383210972 39.47841760435743 8.15178609465426E-5)))

(defun f (a b c d)
  (format t "~S~%~,17G~%"
          (get-fpu-mode)
          (+ a (* b c d))))

(f 0.35590502561699244 -4.917272383210972 39.47841760435743 8.15178609465426E-5)

and was originally from this line: https://github.com/q3cpma/are-we-fast-yet/blob/common-lisp/benchmarks/Common-Lisp/nbody.lisp#L107

xrme commented 1 month ago

This must be due to inconsistency in order of evaluation.

(defun f1 (a b c d)
  (format t "~,17G~%"
          (+ a (* b c d))))

(defun f2 (a b c d)
  (format t "~,17G~%"
          (+ a (* b (* c d)))))

(defun f3 (a b c d)
  (format t "~,17G~%"
          (+ a (* (* b c) d))))

f1 and f2 compile to the same code, i.e., group the multiplications from right to left, and produce the value 0.34008027853208490. f3 explicitly groups the multiplications from left to right, and produces 0.34008027853208495.

(f2 0.35590502561699244 -4.917272383210972 39.47841760435743 8.15178609465426E-5)
=> 0.34008027853208490
(f2 0.35590502561699244 -4.917272383210972 39.47841760435743 8.15178609465426E-5)
=> 0.34008027853208490
(f3 0.35590502561699244 -4.917272383210972 39.47841760435743 8.15178609465426E-5)
=> 0.34008027853208495

According to https://www.lispworks.com/documentation/HyperSpec/Body/12_aaa.htm either order is OK, but it seems weird that

(+ 0.35590502561699244
   (* -4.917272383210972
      39.47841760435743
      8.15178609465426E-5))
=> 0.34008027853208495

whereas

(funcall #'(lambda () (+ 0.35590502561699244
                         (* -4.917272383210972
                            39.47841760435743
                            8.15178609465426E-5))))
=> 0.3400802785320849 ;without trailing 5

q3cpma commented 1 month ago

That would certainly explain it (fun coincidence, I read that CLHS entry earlier today)! Maybe an interpreter/compiler difference? Not knowledgeable enough about CCL to tell if this is possible.

xrme commented 1 month ago

Cases of interest:

eval
constant folding
associativity in compiled code

In the eval case, we group the multiplications in left-to-right order. This is what happens when we evaluate

(+ 0.35590502561699244
   (* -4.917272383210972
      39.47841760435743
      8.15178609465426E-5))
=> 0.34008027853208495

at the top level.

In the constant folding case, ccl::fold-constant-subforms collects constant subforms in reverse order. That is what happens here:

(funcall #'(lambda () (+ 0.35590502561699244
                         (* -4.917272383210972
                            39.47841760435743
                            8.15178609465426E-5))))

In the compiled code case, a compiler macro on * transforms into a sequence of pair-wise multiplications, and does that in right-to-left order. That is what happens here:

(defun f1 (a b c d)
  (format t "~,17G~%"
          (+ a (* b c d))))

The spec allows us to process arguments of mathematically associative functions in any order we want, but it seems to me that we ought to do it consistently.

xrme commented 1 month ago

17ef5428bb354fcdce21fbfb6d2d93532be48200 addresses the constant folding case. We were accumulating constants by pushing onto a list. We now reverse the list before passing it to apply.

7ffb738c3c7d60370589616310a2914dfc404dbc addresses the compiler case. The compiler macro on * was pairing up terms from right to left. I changed it to pair up from left to right.

I think that I could argue that this issue isn't really a bug, but it seems good to me to be consistent about how we group terms.

xrme commented 1 month ago

Also see e62e177e3470688470ae6da7ecaaf6e9d5ee5557, in which we stop using %temp-cons and start using nreverse in fold-constant-subforms

Clozure / ccl

Different rounding inside/outside closure #422