Clozure / ccl

Clozure Common Lisp
http://ccl.clozure.com
Apache License 2.0

seemingly silly strength reduction miss #482

Open xrme opened 2 months ago

xrme commented 2 months ago
(defun foo (i)
  (declare (optimize (speed 3) (safety 0)))
  (declare (fixnum i))
  (the fixnum (* 2 i)))

(defun bar (i)
  (declare (optimize (speed 3) (safety 0)))
  (declare (fixnum i))
  (the fixnum (ash i 1)))
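
A minimal sketch for reproducing the comparison at a REPL. It repeats the two definitions so that it stands alone, checks that they agree on a sample value, and then uses the standard `disassemble` function. The output format and the instructions emitted depend on the implementation, version, and platform, so the listings may not match exactly.

```lisp
;; Same definitions as in the report, repeated so this snippet stands alone.
(defun foo (i)
  (declare (optimize (speed 3) (safety 0)))
  (declare (fixnum i))
  (the fixnum (* 2 i)))

(defun bar (i)
  (declare (optimize (speed 3) (safety 0)))
  (declare (fixnum i))
  (the fixnum (ash i 1)))

;; Make sure both are compiled, even in an implementation that
;; would otherwise interpret definitions typed at the REPL.
(compile 'foo)
(compile 'bar)

;; Both forms compute the same value, so any difference is purely in codegen.
(assert (= (foo 21) (bar 21)))

;; DISASSEMBLE is standard Common Lisp; what it prints is implementation-specific.
(disassemble 'foo)
(disassemble 'bar)
```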

Disassembly for foo. We recognize that we can turn the multiply into a shift, but for some reason we jump to the out-of-line .SPBUILTIN-ASH subprimitive instead of open-coding the shift:

   +0: 4c 8d 2d f9  (recover-fn-from-rip)
       ff ff ff
   +7: 55           (pushq (% rbp))
   +8: 48 89 e5     (movq (% rsp) (% rbp))
  +11: 56           (pushq (% arg_z))

;;; (* 2 i)
  +12: 48 8b 7d f8  (movq (@ -8 (% rbp)) (% arg_y))
  +16: be 08 00 00  (movl ($ 8) (% arg_z.l))
       00
  +21: 48 89 ec     (movq (% rbp) (% rsp))
  +24: 5d           (popq (% rbp))
  +25: ff 24 25 a8  (jmp (@ .SPBUILTIN-ASH))
       53 01 00

Disassembly for bar, where we open-code:

   +0: 4c 8d 2d f9  (recover-fn-from-rip)
       ff ff ff
   +7: 55           (pushq (% rbp))
   +8: 48 89 e5     (movq (% rsp) (% rbp))
  +11: 56           (pushq (% arg_z))

;;; (ash i 1)
  +12: 48 8b 75 f8  (movq (@ -8 (% rbp)) (% arg_z))
  +16: 48 c1 e6 01  (shlq ($ 1) (% arg_z))
  +20: 48 89 ec     (movq (% rbp) (% rsp))
  +23: 5d           (popq (% rbp))
  +24: c3           (retq)

The situation on the ARM is similar.

The function foo:

  (mov imm0 (:$ 19))
  (stmdb (:! sp) (imm0 vsp fn lr))
  (mov fn temp2)
  (vpush1 arg_z)                        ;[12]

;;; (* 2 i)
  (mov arg_y arg_z)                     ;[16]
  (mov arg_z '1)
  (ldmia (:! sp) (imm0 vsp fn lr))
  (spjump .SPbuiltin-ash)

The function bar:

  (mov imm0 (:$ 19))
  (stmdb (:! sp) (imm0 vsp fn lr))
  (mov fn temp2)
  (vpush1 arg_z)                        ;[12]

;;; (ash i 1)
  (mov arg_z (:lsl arg_z (:$ 1)))       ;[16]
  (ldmia (:! sp) (imm0 vsp fn pc))

xrme commented 1 month ago

It looks like acode-rewrite-mul2 changes the multiply into a shift, but it doesn't do enough work to know whether it can use (%nx1-operator fixnum-ash) instead of (%nx1-operator ash).

Maybe we could add acode-rewrite-ash to catch this case?
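
To make the idea concrete, here is a toy sketch of the decision involved. It is not CCL's acode representation and does not reproduce acode-rewrite-mul2. Forms are plain s-expressions, the operator names `fixnum-ash` and `generic-ash` are hypothetical stand-ins for the two nx1 operators, and the two boolean arguments stand in for type information the compiler would derive from declarations.

```lisp
;;; Toy model only: plain s-expressions, not CCL acode.
;;; FIXNUM-ASH and GENERIC-ASH are hypothetical operator names.

(defun power-of-two-exponent (n)
  "Return k if N is a positive integer equal to 2^k, else NIL."
  (when (and (integerp n) (plusp n) (= (logcount n) 1))
    (1- (integer-length n))))

(defun rewrite-mul (form operand-fixnum-p result-fixnum-p)
  "Rewrite (* C X) or (* X C), with C a power of two, into a shift.
Pick the fixnum-only shift when both the operand and the result are
known to be fixnums; otherwise fall back to the generic shift."
  (destructuring-bind (op a b) form
    (assert (eq op '*))
    (let* ((ka (power-of-two-exponent a))
           (kb (power-of-two-exponent b))
           (k (or ka kb))          ; shift count, if any
           (x (if ka b a)))        ; the non-constant operand
      (cond ((null k) form)        ; no power-of-two factor: leave as is
            ((and operand-fixnum-p result-fixnum-p)
             `(fixnum-ash ,x ,k))
            (t
             `(generic-ash ,x ,k))))))

;; With fixnum information available, the open-codable operator is chosen:
;;   (rewrite-mul '(* 2 i) t t)
;; Without it, the rewrite still happens but stays generic, which is
;; the behavior the disassembly of foo suggests:
;;   (rewrite-mul '(* 2 i) nil nil)
```

Whether this check belongs in the multiply rewrite itself or in a separate pass over the resulting shift (the acode-rewrite-ash idea) is a design choice; a separate pass would also catch shifts that did not originate from a multiply.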