egallesio / STklos

STklos Scheme
http://stklos.net
GNU General Public License v2.0
69 stars 17 forks source link

Enhancements and optimizations for SRFI-1 #544

Closed jpellegrini closed 1 year ago

jpellegrini commented 1 year ago

Hi @egallesio !

Last one today, I promise :)

There are three commits in this PR:

1. ~Inline~ Make %car+cdrs and friends primitives

The reference implementation of SRFI-1 recommends that the following procedures be inlined:

%cars+, %cdrs, %cars+cdrs, %cars+cdrs+, %cars+cdrs/no-test

~We just did that.~ -- (actually, we made them primitives. We could inline them, but then we'd need to get them into the core, and make the compiler inline even more primitives...)

There is one single C function cars_cdrs that does all the work, and five new internal Scheme primitives.

All procedures in the fold family are faster now.

2. Other general enhancements

3. Optimize fold-right

The single list case has been changed: to avoid exhausting the stack AND to obtain better performance, instead of a recursive procedure we call list->vector and process the vector from the end towards the beginning.

This avoids wrong results due to stack overflow in the following case:

(define L (iota 1_000_000))
(fold - 0 L)

With the default STklos stack size it would overwrite local variables and signal an error. With this patch, it works (because the stack does not grow at all, and all work is done inside a single vector).

It also makes fold-right faster:

(let ((L (iota 1_000))
      (a 0))
  (time
   (repeat  30000
     (set! a (fold-right - 0 L))))
  a)
previous version: 5527.42 ms
new version:      3996.49 ms

We only change the case of one single list -- this optimization would be less efficient for the N-ary cse, since we'd have to compute the minimum size of all lists (or vectors), potentially traversing the larger list (that could perhaps be a million times larger than the others). It would also probably require some more uses of apply.

egallesio commented 1 year ago

Merged this PR. It definitively makes things better (in particular the function which uses call/ec). Thnaks a lot @jpellegrini