~We just did that.~ -- (actually, we made them primitives. We could inline them, but then we'd need to get them into the core, and make the compiler inline even more primitives...)
There is a single C function, `cars_cdrs`, that does all the work, and five new internal Scheme primitives.
All procedures in the `fold` family are faster now.
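For readers unfamiliar with these helpers, here is a rough Scheme sketch of what a "cars and cdrs" splitter computes. The name `cars+cdrs` and this definition are illustrative assumptions in portable R7RS; they are not the C code of `cars_cdrs` nor the actual STklos primitives.

```scheme
;; Hypothetical helper, for illustration only (not STklos code).
;; Given a list of lists, return two values: the list of their cars
;; and the list of their cdrs. If any list is empty, return two
;; empty lists, which tells the caller that iteration is finished.
(define (cars+cdrs lists)
  (let loop ((ls lists) (cars '()) (cdrs '()))
    (cond ((null? ls) (values (reverse cars) (reverse cdrs)))
          ((null? (car ls)) (values '() '()))
          (else (loop (cdr ls)
                      (cons (caar ls) cars)
                      (cons (cdar ls) cdrs))))))

(call-with-values
  (lambda () (cars+cdrs '((1 2 3) (a b c))))
  list)   ; => ((1 a) ((2 3) (b c)))
```

An N-ary `fold` calls something like this once per step, which is why moving the work into a primitive can pay off across the whole family.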
2. Other general enhancements
In `redefine`, get the standard procedures from module `SCHEME`, not `STklos`.
Don't define `proper-list?`; just make it an alias to `list?`, since they do the same thing and `list?` is already a fast primitive in the STklos core.
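As a minimal sketch of the aliasing idea (the exact form used in the commit may differ), binding the new name to the existing procedure is enough:

```scheme
;; Sketch only: bind proper-list? to the existing list? primitive
;; instead of writing a separate traversal.
(define proper-list? list?)

(proper-list? '(1 2 3))     ; => #t
(proper-list? '(1 2 . 3))   ; => #f  (dotted list, not a proper list)
(proper-list? '())          ; => #t
```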
3. Optimize `fold-right`
The single-list case has been changed: to avoid exhausting the stack AND to obtain better performance, instead of using a recursive procedure we call `list->vector` and process the vector from the end towards the beginning.
This avoids wrong results due to stack overflow in the following case:

```scheme
(define L (iota 1_000_000))
(fold-right - 0 L)
```
With the default STklos stack size it would overwrite local variables and signal an error. With this patch, it works (because the stack does not grow at all, and all work is done inside a single vector).
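The idea can be sketched in portable Scheme as follows. The name `fold-right/vector` is hypothetical and this is not the code from the commit; it only shows how copying the list into a vector lets a tail-recursive loop walk from the last element to the first.

```scheme
;; Illustrative sketch; fold-right/vector is a made-up name.
;; Copy the list into a vector, then iterate from the last index
;; down to 0. The loop is a tail call, so the stack does not grow.
(define (fold-right/vector kons knil lst)
  (let ((v (list->vector lst)))
    (let loop ((i (- (vector-length v) 1)) (acc knil))
      (if (< i 0)
          acc
          (loop (- i 1) (kons (vector-ref v i) acc))))))

(fold-right/vector cons '() '(1 2 3))   ; => (1 2 3)
(fold-right/vector - 0 '(1 2 3 4))      ; => -2
```

The cost is one extra vector of the same length as the list, traded for constant stack depth.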
It also makes `fold-right` faster:

```scheme
(let ((L (iota 1_000))
      (a 0))
  (time
    (repeat 30000
      (set! a (fold-right - 0 L))))
  a)
```

previous version: 5527.42 ms
new version: 3996.49 ms
We only change the single-list case. This optimization would be less efficient for the N-ary case, since we'd have to compute the minimum length of all the lists (or vectors), potentially traversing the longest list (which could be a million times longer than the others). It would also probably require some more uses of `apply`.
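To make the cost concrete, here is one naive way the minimum length could be computed. The helper `min-length` is hypothetical and is not part of the patch; it assumes at least one list is given.

```scheme
;; Hypothetical helper, not part of the patch.
;; `length` walks each list to its end, so the longest list is fully
;; traversed even though only the shortest length is needed.
(define (min-length lists)
  (apply min (map length lists)))

(min-length '((1 2 3) (a b)))   ; => 2
```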
Hi @egallesio!
Last one today, I promise :)
There are three commits in this PR:
1. ~Inline~ Make `%car+cdrs` and friends primitives
The reference implementation of SRFI-1 recommends that the following procedures be inlined: