egallesio / STklos

STklos Scheme
http://stklos.net
GNU General Public License v2.0
69 stars 17 forks source link

Optimize (inline) the `c_r` aliases to compositions of `car` and `cdr` #543

Closed jpellegrini closed 1 year ago

jpellegrini commented 1 year ago

1. Inline the cxr accessors

Currently STklos has all cxr aliases implemented as procedures:

(disassemble-expr '(caddar x) #t)

000:  PREPARE-CALL
001:  GLOBAL-REF-PUSH      0
003:  GREF-INVOKE          1 1
006:

Constants:
0: x
1: caddar

This patch optimizes this, opening their code:

(disassemble-expr '(caddar x) #t)

000:  GLOBAL-REF           0
002:  IN-CAR
003:  IN-CDR
004:  IN-CDR
005:  IN-CAR
006:

Constants:
0: x

Some timings:

(let ((a #f) (L '(a b c d e f g))) (time (repeat 50_000_000 (set! a (cadddr L)))) a)
before patch: 4174.988 ms
after patch:  2786.303 ms
(let ((a #f) (L '(a b c d e f g))) (time (repeat 50_000_000 (set! a (cadr L)))) a)
before patch: 3703.763 ms
after patch:  2479.088 ms
(let ((a #f) (L '((((a))) b c d e f g))) (time (repeat 50_000_000 (set! a (caaaar L)))) a)
before patch: 4387.891
after patch:  2802.235

Running the STklos tests seems to become very slightly faster (around 1%, average of 20 repetitions) - probably the speedup isn't great here because the cxr functions are not used that often in libraries, and also because there are slow tests (such as the Unicode tests) that don't use them.

2. Optimize list-ref for small constant indices

Since the cxr accessors have been optimized, we can use them to optimize some cases in list-ref. For example,

(list-ref L 2) => (caddr L)

list-ref becomes indeed faster:

(let ((a #f) (L '(a b c d e f g))) (time (repeat 50_000_000 (set! a (list-ref L 3)))) a)
before: 3828.612 ms
after:  2836.350 ms

(let ((a #f) (L '(a b c d e f g))) (time (repeat 50_000_000 (set! a (list-ref L 0)))) a)
before: 3562.374 ms
before: 2235.379 ms
jpellegrini commented 1 year ago

Yay! :)

egallesio commented 1 year ago

Hi @jpellegrini,

Thanks for this PR. I have merged it and modified a bit the generated code. I have added an instruction in the VM for cxr functions. Performances are better (but not a lot, and very less than I expected). The good news is that it permits to have better error messages. For instance,

stklos> (cadadr '(1 (2 . 3)))
**** Error:
cadadr: wrong type of argument `3' for car in (cadadr '(1 (2 . 3)))
jpellegrini commented 1 year ago

I see you have included a new instruction. I thought of doing it, but was afraid of adding too many instructions to the VM.

Anyway, the implementation of the CXR functions looks great now! :)

egallesio commented 1 year ago

but was afraid of adding too many instructions to the VM.

Yep. I try to avoid the inclusion of new instructions as far as possible. As I said, I really expected better performances with this new instruction. This was not the case, and I'm a bit disappointed. However, since it permits better error messages, I kept it