carp-lang / Carp

A statically typed lisp, without a GC, for real-time applications.

performance: test/macros.carp takes a long time to execute #1215

Open scolsen opened 3 years ago

scolsen commented 3 years ago

This suggests we might have hotspots when dealing with certain functions tested in this file.

scolsen commented 2 years ago

I think I'm starting to encounter this more now that I'm processing a lot of data in dynamic land. In general, recursively defined functions degrade in performance quickly as their input grows.

For example, try the following:

;; note there's a PR out to add this to core
(defndynamic faster-to-list [s]
  (cdr (String.split-on "" s)))

(defndynamic read-source [f]
  (Dynamic.read-file f))

(defndynamic strip-whitespace [string]
  (filter (fn [x] (not (= "\n" x))) (faster-to-list string)))

If you run (strip-whitespace (read-source "/Users/scottolsen/dev/Carp/core/Binary.carp")) you'll find that it takes quite a while to complete. You may also find it leads to space leaks and slows down the REPL as a whole. Redefining filter to be more like Haskell's definition, avoiding calls to append and reduce (extra layers of recursion), eliminates the space leak but it still runs really slowly.
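For reference, the shape of Haskell's filter that the rewrite followed is a single direct recursion, with no append and no fold:

filter' :: (a -> Bool) -> [a] -> [a]
filter' _ [] = []
filter' p (x:xs)
  | p x       = x : filter' p xs
  | otherwise = filter' p xs

-- e.g. filter' even [1..10] == [2,4,6,8,10]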

My guess is that this has to do with the fact that the evaluator proceeds recursively, so every time we have a recursive function we're essentially building up one enormous thunk on the Haskell side, each layer of which contains lookup calls etc., and that leads to the slowdown.
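To see the suspected mechanism in plain Haskell (an analogy, not the evaluator's actual code): a recursive loop that never forces its intermediate results accumulates one suspended computation per step, which is exactly the kind of space leak described above.

import Data.List (foldl')

lazySum :: [Int] -> Int
lazySum = foldl (+) 0    -- builds ((((0+1)+2)+3)+...) before forcing anything

strictSum :: [Int] -> Int
strictSum = foldl' (+) 0 -- forces the accumulator at every step

main :: IO ()
main = do
  print (strictSum [1 .. 1000000]) -- runs in constant space
  print (lazySum   [1 .. 1000000]) -- first builds ~a million nested thunks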

scolsen commented 2 years ago

The situation doesn't change when rewriting filter to be iterative instead of recursive:

(defndynamic filter [p xs]
  (if (empty? xs)
      xs
      (let [filter-result (list)
            items xs]
        (do
          (while (not (empty? items))
            (do (if (p (car items))
                    (set! filter-result (cons (car items) filter-result))
                    ())
                (set! items (cdr items))))
          ;; note: cons accumulates the kept items in reverse order
          filter-result))))

which further makes me think it has something to do with the underlying Haskell code building up structures that take a long time to resolve (super-thunks).

eriksvedang commented 2 years ago

Interesting findings! We definitely need to get to the bottom of this to enable a nice developer experience.

scolsen commented 2 years ago

I tried a couple things the other night but didn't really get anywhere.

Carp has strict semantics: given (foo bar), we always evaluate bar before proceeding to evaluate the call (and return an error if bar is not evaluable). In the standard sense, a function f is strict when f ⊥ = ⊥, i.e. f undefined = undefined.
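In Haskell terms:

constThree :: Int -> Int
constThree _ = 3          -- non-strict: constThree undefined == 3

forceThree :: Int -> Int
forceThree x = x `seq` 3  -- strict: forceThree undefined diverges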

I'm guessing the thunk buildup stems from the fact that eval is not strict in certain arguments, but it's difficult to tell with all the branching. apply is also clearly not strict in any of its arguments, as far as I can tell, so that could be where the thunk buildup is occurring. (Actually, that would make more sense to me: there's little issue evaluating long lists, which would presumably be equally problematic if the root eval loop were at fault, since evaluating a list entails evaluating all of its members.)
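A minimal sketch of what making apply strict could look like, with invented types (this is not Carp's actual apply):

{-# LANGUAGE BangPatterns #-}

data Obj = Num Int | Lst [Obj]
  deriving Show

-- Lazy: nothing forces the closure or the argument, so recursive calls can
-- pile up suspended applications.
applyLazy :: (Obj -> Obj) -> Obj -> Obj
applyLazy f arg = f arg

-- Strict: bang patterns force both to weak head normal form before the
-- call, collapsing pending thunks at each application.
applyStrict :: (Obj -> Obj) -> Obj -> Obj
applyStrict !f !arg = f arg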

scolsen commented 2 years ago

I'm getting a bit closer to understanding the problem here; basically, there are two issues in combination:

  1. The real expense is not evaluation, but rather lookups. When we have to traverse the entire environment for a symbol, it's costly.
  2. We redo tons of work in recursive functions! When we evaluate a recursive function, we evaluate and look up every symbol in the function body N times, where N is the number of recursions! Instead, we should process function bodies in two steps:
    • first, look up all the non-argument symbols in the body, evaluate and resolve them all once, and build a "cache" environment that points to all of these symbols;
    • second, evaluate the function, with arguments passed, using the cache to perform all subsequent lookups.

This should make it so that, on recursive calls, we aren't traversing (potentially) the entire environment multiple times to look up an already-resolved value.
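A rough sketch of step one in Haskell, with invented Expr/Env types standing in for the evaluator's real ones:

import qualified Data.Map.Strict as Map

type Name = String
data Expr = Sym Name | App Expr [Expr] | Lit Int

type Env = Map.Map Name Expr

-- Step one: walk the body a single time, resolving every free (non-argument)
-- symbol against the full environment and recording it in a small flat cache.
buildCache :: Env -> [Name] -> Expr -> Env
buildCache global args body = go body Map.empty
  where
    go (Sym n) cache
      | n `elem` args = cache                  -- argument: bound at call time
      | Just v <- Map.lookup n global = Map.insert n v cache
      | otherwise = cache                      -- unresolved: error later, at eval
    go (App f xs) cache = foldr go (go f cache) xs
    go (Lit _) cache = cache

-- Step two (not shown): evaluate the body with arguments bound, doing every
-- symbol lookup against the cache instead of the global environment.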

There is one potential pitfall with this approach: global side effects. set! means the binding for a name in the cache env could change. How will we handle this? I think we just need to have set! operate on the cache environment; then, at the end of function evaluation, if there's a diff between the previous bindings and the cache bindings, set the previous bindings to the values in the cache.
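Sketching that write-back step with the same caveat (invented types, bindings shrunk to Ints for brevity):

import qualified Data.Map.Strict as Map

type Env = Map.Map String Int

-- After the function returns, keep only the bindings the body actually
-- changed via set!, and merge those back into the enclosing environment.
writeBack :: Env -> Env -> Env
writeBack outer cache = Map.union changed outer  -- left-biased: changes win
  where
    changed = Map.differenceWith
                (\new old -> if new /= old then Just new else Nothing)
                cache outer

-- e.g. writeBack (fromList [("x",1),("y",2)]) (fromList [("x",1),("y",99)])
--      == fromList [("x",1),("y",99)]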

In theory this will solve the performance issue discussed above, but it may take a second to get it right.

scolsen commented 2 years ago

The hardest part will be decoupling function application from eval. Presently we recurse until we have nothing more to evaluate; instead, we'll need a sort of "eval once" mechanism to build the env cache (otherwise we'd just recur while building up the cache).
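If it helps to picture it, an "eval once" step might look something like this (invented types again, not the real evaluator): resolve symbols and reduce what's immediately reducible in one pass, without recursing to a final value.

import qualified Data.Map.Strict as Map

data Expr = Sym String | Lit Int | Add Expr Expr
  deriving Show

type Env = Map.Map String Expr

stepOnce :: Env -> Expr -> Expr
stepOnce env (Sym n)               = Map.findWithDefault (Sym n) n env
stepOnce _   (Add (Lit a) (Lit b)) = Lit (a + b)
stepOnce env (Add a b)             = Add (stepOnce env a) (stepOnce env b)
stepOnce _   e                     = e

-- stepOnce (fromList [("one", Lit 1)]) (Add (Sym "one") (Lit 2))
--   == Add (Lit 1) (Lit 2)   -- one step taken; not reduced to Lit 3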