Pattern matching factorization optimization

tizoc commented 5 years ago

I'm thinking that it may be doable in Javascript, turns out there is a feature that lets you label blocks of code (and loops) and use break with those labels: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/label#Using_a_labeled_block_with_break

Probably not a priority at the moment, but something to consider in the future.

rkoeninger commented 5 years ago

@tizoc Your PR for shen-scheme doesn't mention any performance testing? What was the impact of this optimization? (startup time, test suite time, custom benchmarks...)

tizoc commented 5 years ago

Startup, tests, etc didn't get affected much, the big impact is on specific individual functions that are heavy on pattern matching of deep structures. Most programs will not benefit much from this optimization, but if a program happens to have some loop with one of such functions being called a lot, it is going to benefit for sure.

I didn't do much performance testing other than what I mentioned on the mailing list:

Pasting the results here:

Functions:

(define simplify
  [Op A B] -> (s Op (simplify A) (simplify B))
  A -> A)

(define s
  + M N -> (+ M N)      where (and (number? M) (number? N))
  + 0 F -> F
  + F 0 -> F
  + A [+ B C] -> (simplify [+ [+ A B] C])
  * M N -> (* M N)    where (and (number? M) (number? N))
  * 0 F -> 0
  * F 0 -> 0
  * F 1 -> F
  * 1 F -> F
  * A [* B C] -> (simplify [* [* A B] C])
  Op A B -> [Op A B])

(define neals-function
  [1 2 3 1 2 3 1 2 3 1 2 3 | B] [[[1 | 2] | 3] [1 | 2] [[1 | 2] | 3] [1 | 2] | B] -> 0
  [X|X] [X|X] -> (+ X 1000))

optimized neals-function
    no collections
    8.424246713s elapsed cpu time
    8.433337000s elapsed real time
    32 bytes allocated

unoptimized neals-function
    no collections
    15.078042225s elapsed cpu time
    15.095092000s elapsed real time
    32 bytes allocated

optimized simplify
    1025 collections
    8.814395178s elapsed cpu time, including 0.043543484s collecting
    8.824976000s elapsed real time, including 0.047677000s collecting
    8640656016 bytes allocated, including 8637151872 bytes reclaimed

unoptimized simplify
    1025 collections
    10.267284990s elapsed cpu time, including 0.042916747s collecting
    10.287693000s elapsed real time, including 0.046882000s collecting
    8640656016 bytes allocated, including 8640178800 bytes reclaimed

I wouldn't spend time on this now if I were you, wait at least to until after I make a generic version as an extension (if I'm able to), will save you a lot of work.

tizoc commented 4 years ago

@rkoeninger any pointers to where I should start looking at if I wanted to give this a try?

I think labelled blocks make this possible, but I want to be sure. If it is not possible I may have to try implementing an alternative factorization strategy for platforms that cannot use the current one (would probably not perform the full optimization, just part of it).

rkoeninger commented 4 years ago

I never had a working example of goto-like code in JS so that's the first hurdle. You can break or continue to labelled statements but I don't know what the scope of those labels is. Maybe using a do { } while(false) would allow a scope that can use continue-to-labels while being naturally non-repeating. I don't know, you'll have to play with that.

Here's the root build function which identifies special forms, you'd probably add a isSwitch function to identify when we're building a switch-case style construct. (Maybe isFactoredDefun would be a better name.)

https://github.com/rkoeninger/ShenScript/blob/0e990504139b08d2a44816ead38796d49619b3a7/lib/backend.js#L344-L350

There are functions in ast.js that help with building js-ast's so start with those, I don't know if there's functions for the new forms of syntax you'd be using, so you might have to add a few like Break and Label. I can help with that.

tizoc commented 4 years ago

I never had a working example of goto-like code in JS so that's the first hurdle.

Fair enough. I will try to produce a code example in JS that does what I mean and come back to this issue to post it. But let me give you a rough idea of how in my mind it would work.

No do/while would be used because labels support plain blocks. In terms of how code is generated, Javascript's result would be more similar to Common Lisp (label jumps with arguments not being used) rather than Scheme (tail calls to functions, passing everything it references from the scope as arguments).

(%%let-label (<label> ...<args>) <label-body> <body>)

should produce something like

<label>: {
   <body>
}
<label-body>

Which is similar to what the Common Lisp port generates but inverted (the because the label body is outside of the body and the non-label body outside, a "goto" here gets out of the block).

In Common Lisp the result is (tagbody <body> <label> <label-body>), in Javascript <label>: { <body> } <label-body>.

Then (%%goto-label <label> ...<args>)

should produce something like

break <label>;

and (%%return <expr>)

should produce:

return <expr>;

Like in the case Common Lisp, labelled blocks will be nested, there is always a single immediate outer label, which can be represented easily with labelled blocks in Javascript.

tizoc commented 4 years ago

@rkoeninger when inside the ShenScript REPL, how can I produce Javascript code from a Kl input?

tizoc commented 4 years ago

Found it, js.ast.compile

tizoc commented 4 years ago

Ouch, I wanted to use the generated code as a starting point and adjust it to use labelled breaks, but the output code doesn't look at all like what I was expecting.

Does every expression have to be awaited? or is there something I'm doing wrong when generating the JS code?

I'm doing: (js.ast.render (js.ast.compile (value *code*)))

tizoc commented 4 years ago

Ok, I think I understand what is going on. Have to go now, will continue working on this later.

rkoeninger commented 4 years ago

Ouch, I wanted to use the generated code as a starting point and adjust it to use labelled breaks, but the output code doesn't look at all like what I was expecting.

The generated code is very expression-oriented (including converting lets into lambdas) which might clash badly with the statement-oriented nature of this goto pattern. Considering this optimization you're trying to implement only targets the root-level of defuns (I think?) you can add a special case or an alternate version of the buildDefun function and keep each conditional and consequent (and all other build* functions) in fundamentally expression-oriented form.

I don't want to have to bring back the statement-composing fabr concepts.

Does every expression have to be awaited? or is there something I'm doing wrong when generating the JS code?

Every function call that's not certain to not return a promise must be awaited. So that's most function calls. And if a there are any awaits in a function, it needs to be async. That's most functions.

I'm doing: (js.ast.render (js.ast.compile (value *code*)))

Yeah, that's right. The generated code is just highly idiomatic in style.

tizoc commented 4 years ago

Here is the code I'm trying to compile, what I don't understand is why it is generating all those awaits, nothing here is async:

[defun example [V7 V8]
  [%%let-label %%label15 [%%return [shen.f_error example]]
    [if [cons? V7]
        [let V7/hd [hd V7]
          [let V7/tl [tl V7]
            [%%let-label %%label16 [if [and [= 2 V7/hd]
                                           [cons? V7/tl]]
                                       [%%return [hd V7/tl]]
                                       [%%goto-label %%label15]]
              [if [and [= 1 V7/hd] [cons? V7/tl]]
                  [if [= 1 V8]
                      [%%return [hd V7/tl]]
                      [if [= 2 V8]
                          [%%return [tl V7/tl]]
                          [%%goto-label %%label16]]]
                  [%%goto-label %%label16]]]]]
        [%%goto-label %%label15]]]]

I defined dummy functions for %%goto-label, %%return and %%let-label to make sure that the compiler know what they were and didn't assume they are async to be safe. Is there anything else I should be doing?

Now that I saw the generated code I'm not sure if labelled breaks are viable, because new functions are being introduced, and I don't know if jumps from inside the function to outer blocks or if it is efficient. Doesn't this add a lot of overhead in general?

rkoeninger commented 4 years ago

Can you post the generated JS here? Have you made any changes to the backend?

because new functions are being introduced, and I don't know if jumps from inside the function to outer blocks or if it is efficient

Oh, you mean the IIFE (Immediately Invoked Function Expression) pattern like how (let X Y Z) becomes (X => Z)(Y). I tried a long experiment to replace IIFE's with more conventional syntax and it was acutally a lot slower and generated much more code. IIFEs are very common in JS so JS engines are probably really good at optimizing them.

I think you're right that you can't break from inside an IIFE to surrounding context. It wouldn't be valid JS. Maybe you could have the IIFE return early with some kind of symbol and then the IIFE is surrounded in code that checks for the return value and breaks accordingly... this is getting complicated fast.

tizoc commented 4 years ago

Here is the generated code (~sorry for the format, the pretty printers I have tried don't seem to be able to process it~ Edit: semi-formatted it by hand and then prettifiers could at least indent it)

await (async ($25$25let$2dlabel$c, $25$25label15$s, $25$25return$c, error$c, example$s, $25$25label16$s, $25$25goto$2dlabel$c) =>
    $.d("example", $.l(async (V7, V8) =>
        $.b($25$25let$2dlabel$c.f, $25$25label15$s, await $.t($25$25return$c.f(await $.t(error$c.f(example$s)))), $.isCons(V7) ?
            await (async V7$2fhd =>
                await (async V7$2ftl =>
                    await $.t($25$25let$2dlabel$c.f($25$25label16$s, 2 === V7$2fhd && $.isCons(V7$2ftl) ?
                        await $.t($25$25return$c.f($.asCons(V7$2ftl).head)) :
                        await $.t($25$25goto$2dlabel$c.f($25$25label15$s)), 1 === V7$2fhd && $.isCons(V7$2ftl) ?
                        1 === V8 ?
                        await $.t($25$25return$c.f($.asCons(V7$2ftl).head)) :
                        2 === V8 ?
                        await $.t($25$25return$c.f($.asCons(V7$2ftl).tail)) :
                        await $.t($25$25goto$2dlabel$c.f($25$25label16$s)) :
                        await $.t($25$25goto$2dlabel$c.f($25$25label16$s)))))(V7.tail))(V7.head) :
            await $.t($25$25goto$2dlabel$c.f($25$25label15$s)))))
    )($.c("%%let-label"), $.s `%%label15`, $.c("%%return"), $.c("error"), $.s `example`, $.s `%%label16`, $.c("%%goto-label"))

Edit: roughly, this is what I had in my mind:

Shen function:

(define example
  [1 X | Xs] 1 -> X
  [1 X | Xs] 2 -> Xs
  [2 X | Xs] _ -> X)

KLambda for that code

(defun example (V1345 V1346)
 (cond
  ((and (cons? V1345)
        (and (= 1 (hd V1345)) (and (cons? (tl V1345)) (= 1 V1346))))
   (hd (tl V1345)))
  ((and (cons? V1345)
        (and (= 1 (hd V1345)) (and (cons? (tl V1345)) (= 2 V1346))))
   (tl (tl V1345)))
  ((and (cons? V1345) (and (= 2 (hd V1345)) (cons? (tl V1345))))
   (hd (tl V1345)))
  (true (shen.f_error example))))

After factorization

(defun example (V1345 V1346)
 (%%let-label (%%label1347) (%%return (shen.f_error example))
  (if (cons? V1345)
      (let V1345/hd (hd V1345)
        (let V1345/tl (tl V1345)
          (%%let-label (%%label1348 V1345/hd V1345/tl)
           (if (and (= 2 V1345/hd) (cons? V1345/tl))
               (%%return (hd V1345/tl))
               (%%goto-label %%label1347))
           (if (and (= 1 V1345/hd) (cons? V1345/tl))
               (if (= 1 V1346)
                   (%%return (hd V1345/tl))
                   (if (= 2 V1346)
                       (%%return (tl V1345/tl))
                       (%%goto-label %%label1348 V1345/hd V1345/tl)))
               (%%goto-label %%label1348 V1345/hd V1345/tl)))))
      (%%goto-label %%label1347))))

Common Lisp port generates:

(DEFUN example (V1345 V1346)
  (BLOCK NIL
    (TAGBODY
      (IF (CONSP V1345)
          (LET ((V1345/hd (CAR V1345)))
            (LET ((V1345/tl (CDR V1345)))
              (TAGBODY
                (IF (AND (EQL V1345/hd 1) (CONSP V1345/tl))
                    (IF (EQL V1346 1)
                        (RETURN (CAR V1345/tl))
                        (IF (EQL V1346 2)
                            (RETURN (CDR V1345/tl))
                            (GO %%label1348)))
                    (GO %%label1348))
               %%label1348
                (IF (AND (EQL V1345/hd 2) (CONSP V1345/tl))
                    (RETURN (CAR V1345/tl))
                    (GO %%label1347)))))
          (GO %%label1347))
     %%label1347
      (RETURN (shen.f_error 'example)))))

Equivalent Javascript would be something like:

function example(V1345, V1346) {
    __label1347: {
        if (consp(V1345)) {
            const V1345_hd = hd(V1345);
            const V1345_tl = tl(V1345);
            __label1348: {
                if (V1345_hd === 1 && consp(V1345_tl)) {
                    if (V1346 === 1) {
                        return hd(V1345_tl);
                    } else {
                        if (V1346 === 2) {
                            return tl(V1345_tl);
                        } else {
                            break __label1348;
                        }
                    }
                } else {
                    break __label1348;
                }
            }
            if (V1345_hd == 2 && consp(V1345_tl)) {
                return hd(V1345_tl);
            } else {
                break __label1347;
            }
        } else {
            break __label1347;
        }
    }
    return shen.f_error(symbol`example`);
}

tizoc commented 4 years ago

Ah, and no changes to the backend, I'm using ShenScript as-is

rkoeninger commented 4 years ago

Formatted with some inlining:

$.d("example", $.l(async (V7, V8) =>
  $.b($.f("%%let-label"),
    $.s`%%label15`,
    await $.t($.f("%%return")(await $.t($.f("error")($.s`example`)))),
    $.isCons(V7) ?
      await $.t($.f("%%let-label")(
        $.s`%%label16`,
        2 === V7.head && $.isCons(V7.tail) ?
          await $.t($.f("%%return")(V7.tail.head)) :
          await $.t($.f("%%goto-label")($.s`%%label15`)),
        1 === V7.head && $.isCons(V7.tail) ?
          1 === V8 ? await $.t($.f("%%return")(V7.tail.head)) :
          2 === V8 ? await $.t($.f("%%return")(V7.tail.tail)) :
                     await $.t($.f("%%goto-label")($.s`%%label16`)) :
        await $.t($.f("%%goto-label")($.s`%%label16`)))) :
      await $.t($.f("%%goto-label")($.s`%%label15`)))))

You're not going to be able to do any meaningful manipulation of the estree (js ast). %%let-label and the others will have to be treated as special forms. Special forms are handled by the build* functions.

But you'd also have to bring back statements to the Fabrication class which by my last experiment would more than outweigh the benefits of factorization.

tizoc commented 4 years ago

Yes, and it is not just about adding statement. The code generation I had in mind is very different to what ShenScript does now, stuff like compiling lets to this instead using IIFE to handle bindings:

{ const NewVar = Expression; 
  { const OtherVar = OtherExpression; 
    return SomeResult; } }

which is a big change, because this is not expression-oriented anymore and some analysis would have to be done to decide where to put returns and when there shouldn't be any (when the result of the let is to be bound to something and not the final result of the function for example). This complicates ifs too, because the ternary operator is expression-oriented but doesn't allow for blocks or statements inside it, and regular JS if has the opposite problem and some special care is needed to handle cases where it is used as an expression.

Regarding IIFE, I don't know how good Javascript runtimes are at optimizing them, I guess they do a good job because it doesn't seem to be particularly hard, and as you mention, it is a common pattern. What I'm not that sure about is all those async/awaits that get introduced because of it when the code cannot be proven to be sync. Not only it seems harder to optimize for, regular Javascript code itself is not that heavy on async/await, and because of that I don't know how much JS runtimes focus on it.

Anyway, thats a whole different discussion, and I don't know if you are interested on it, so going back to the original topic: given how code is generated now, the labelled breaks approach is not possible, what would work instead is using continuations, so that each labelled block is a function that gets either passed to the scope (bound to name of the label) or defined at the root of the function definition, %%goto-label a call to that function, and (%%return Exp) just Exp being that the generated code is expression oriented. This is more like Shen/Scheme's approach than Shen/CL's.

rkoeninger commented 4 years ago

I really don't feel like re-writing the compiler right now, which this would require. I don't think factorization is going to be supported by ShenScript as previous experience with attempting such a re-write was so bad and ultimately counterproductive (slower, more code volume).

tizoc commented 4 years ago

I really don't feel like re-writing the compiler right now, which this would require.

You don't have to! everything I mentioned about the compilation strategy is about a specific way of implementing factorization (the one I mentioned on my first post), and that is what isn't doable with the current compilation strategy. The alternative one (that fits the current compilation strategy) is to use functions for jumps, like I do in Shen/Scheme. I don't think Javascript runtimes will optimize it into gotos like Chez does, but whatever overhead that may be added by that strategy (compared to labelled break) exists already, so it is not an issue.

The way this will be compiled is:

(%%return <exp>) -> <exp> (nothing has to be done, because the code is expression oriented)
(%%let-label (<label> <label-args>) <label-body> <body>) -> bind <label> to (<label-args>) => <label-body> (if possible, without adding it to <body>'s as far as KLambda goes, but Javascript should see that binding).
(%%goto-label <label> <label-args>) -> just invoke que function bound to <label> passing it <label-args>

I only need to figure out how to do a few things, which now I kind of know how to do after going through the build* functions:

Build a javascript function that takes N arguments (that is, single function that takes all arguments, not nested functions)
A way to invoke this function directly (that is, uncurried, by passing all arguments directly).
When binding that function, do so in a way that doesn't extend the scope (I'm not 100% sure if this is necessary)

tizoc commented 4 years ago

Edit: ignore this, I figure it out

I added this:

  isForm(expr, 0, '%%return')     ? buildReturn   (context, expr) :
  isForm(expr, 0, '%%goto-label') ? buildGotoLabel(context, expr) :
  isForm(expr, 0, '%%let-label')  ? buildLetLabel (context, expr) :

and

const buildReturn = (context, [_return, expr]) => {
  return build(context, expr);
};
const buildGotoLabel = (context, [_goto, ...call]) => {
  return buildApp(context, call);
};
const buildLetLabel = (context, [_letLabel, labelForm, labelBody, body]) => {
  const [label, ...labelParams] = labelForm;
  const labelFunc = buildFunction(context, labelParams, labelBody);
  return buildLet(context, ['let', label, labelFunc, body]);
};

but somewhere it is failing to compile the following function:

(set *code* (read-from-string "(defun example (V1345 V1346)
 (%%let-label (%%label1347) (%%return (shen.f_error example))
  (if (cons? V1345)
      (let V1345/hd (hd V1345)
        (let V1345/tl (tl V1345)
          (%%let-label (%%label1348 V1345/hd V1345/tl)
           (if (and (= 2 V1345/hd) (cons? V1345/tl))
               (%%return (hd V1345/tl))
               (%%goto-label %%label1347))
           (if (and (= 1 V1345/hd) (cons? V1345/tl))
               (if (= 1 V1346)
                   (%%return (hd V1345/tl))
                   (if (= 2 V1346)
                       (%%return (tl V1345/tl))
                       (%%goto-label %%label1348 V1345/hd V1345/tl)))
               (%%goto-label %%label1348 V1345/hd V1345/tl)))))
      (%%goto-label %%label1347))))"))

(define show-js -> (js.ast.render (js.ast.compile (hd (value *code*)))))

(show-js)

The error I get is (modified to include the expr object):

not a valid form: Fabrication {
  ast: {
    type: 'CallExpression',
    callee: {
      type: 'MemberExpression',
      object: { type: 'Identifier', name: '$' },
      property: { type: 'Identifier', name: 'l' },
      computed: false
    },
    arguments: [
      {
        type: 'ArrowFunctionExpression',
        async: true,
        params: [],
        body: [ Symbol(shen.f_error), Symbol(example) ]
      }
    ]
  },
  subs: {}
}

@rkoeninger any idea of what am I missing?

tizoc commented 4 years ago

Ignore my question, its fixed.

tizoc commented 4 years ago

What I have right now is this:

const buildReturn = (context, [_return, expr]) => {
  return build(context, expr);
};
// (%%goto-label f ...args) -> f(...args), with no lookup being done 
// for f, has to be a direct javascript call
const buildGotoLabel = (context, [_goto, f, ...args]) => {
  return Call(SafeId(nameOf(f)), args, context.async);
};
const buildLetLabel = (context, [_letLabel, labelForm, labelBody, body]) => {
  const [label, ...labelParams] = labelForm;
  const labelFunc = buildFunction(context, labelParams, labelBody);
  return assemble(
    (y, z) => Call(Arrow([SafeId(nameOf(label))], z, context.async), [y], context.async),
    uncast(labelFunc),
    uncast(build(context, body)));
};

but it fails with this error, and I don't know from where it is coming from:

generator[nodes[0].type] is not a function

tizoc commented 4 years ago

Ok, fixed, not having a typechecker makes me dizzy:

const buildReturn = (context, [_return, expr]) => {
  return build(context, expr);
};
const buildGotoLabel = (context, [_goto, f, ...args]) => {
  return Call(SafeId(nameOf(f)), args.map(s => SafeId(nameOf(s))), context.async);
};
const buildLetLabel = (context, [_letLabel, labelForm, labelBody, body]) => {
  const [label, ...labelParams] = labelForm;
  const labelFunc = buildFunction(context, labelParams, labelBody);
  return assemble(
    (y, z) => Call(Arrow([SafeId(nameOf(label))], z, context.async), [y], context.async),
    uncast(labelFunc),
    uncast(build(context, body)));
};

The code generated now looks like this:

await (async (shen$2ef_error$c, example$s) =>
    $.d("example", $.l(async (V1345, V1346) =>
        await (async $25$25label1347 =>
            $.isCons(V1345) ?
            await (async V1345$2fhd =>
                await (async V1345$2ftl =>
                    await (async $25$25label1348 =>
                        1 === V1345$2fhd && $.isCons(V1345$2ftl) ?
                        1 === V1346 ?
                        $.asCons(V1345$2ftl).head :
                        2 === V1346 ?
                        $.asCons(V1345$2ftl).tail :
                        await $25$25label1348(V1345$2fhd, V1345$2ftl) :
                        await $25$25label1348(V1345$2fhd, V1345$2ftl))($.l(async (V1345$2fhd, V1345$2ftl) =>
                        2 === V1345$2fhd && $.isCons(V1345$2ftl) ?
                        $.asCons(V1345$2ftl).head :
                        await $25$25label1347())))(V1345.tail))(V1345.head) :
            await $25$25label1347())($.l(async () =>
            $.b(shen$2ef_error$c.f, example$s))))))($.c("shen.f_error"), $.s `example`)

I still have to enable the extension and give it a try to see that all tests still pass, but I think the hardest part is solved already.

tizoc commented 4 years ago

Ok, bad news, the overhead of the introduced intermediary function definitions and calls is bigger than the savings (V8 isn't smart enough to optimize it away) for normal cases, only very extreme cases would benefit from this (but couldn't try those).

It seems the Kl->Js compiler has a hard time compiling the following function:

(define neals-function
  [1 2 3 1 2 3 1 2 3 1 2 3 | B] [[[1 | 2] | 3] [1 | 2] [[1 | 2] | 3] [1 | 2] | B] -> 0
  [X|X] [X|X] -> (+ X 1000))

It uses 100% of my cpu and never finishes (either with or without my modifications).

tizoc commented 4 years ago

Made a few changes so that labels aren't counted as external calls, and neither is shen.f_error, so that the generated function isn't async, but that didn't help much. This is the code generated by the new version:

await (async (shen$2ef_error$c, example$s) =>
    $.d("example", $.l((V1345, V1346) =>
        ($25$25label1347 =>
            $.isCons(V1345) ?
            (V1345$2fhd =>
                (V1345$2ftl =>
                    ($25$25label1348 =>
                        1 === V1345$2fhd && $.isCons(V1345$2ftl) ?
                        1 === V1346 ?
                        $.asCons(V1345$2ftl).head :
                        2 === V1346 ?
                        $.asCons(V1345$2ftl).tail :
                        $25$25label1348(V1345$2fhd, V1345$2ftl) :
                        $25$25label1348(V1345$2fhd, V1345$2ftl))(
                        $.l((V1345$2fhd, V1345$2ftl) =>
                            2 === V1345$2fhd && $.isCons(V1345$2ftl) ?
                            $.asCons(V1345$2ftl).head :
                            $25$25label1347())))(V1345.tail))(V1345.head) :
            $25$25label1347())(
            $.l(() =>
                $.b(shen$2ef_error$c.f, example$s))))))($.c("shen.f_error"), $.s `example`)

Another thing to try would be to hoist those label functions, but I don't think it will make a difference.

Edit:

Here is the test btw, unoptimized version takes about 1.4s and optimized 2.7s:

(define factorize Name -> (eval-kl (shen.x.factorise-defun.factorise-defun (ps Name))))

(define do-times
  0 F -> done
  N F -> (do (F void)
             (do-times (- N 1) F)))

(define simplify
  [Op A B] -> (s Op (simplify A) (simplify B))
  A -> A)

(define s
  + M N -> (+ M N)      where (and (number? M) (number? N))
  + 0 F -> F
  + F 0 -> F
  + A [+ B C] -> (simplify [+ [+ A B] C])
  * M N -> (* M N)    where (and (number? M) (number? N))
  * 0 F -> 0
  * F 0 -> 0
  * F 1 -> F
  * 1 F -> F
  * A [* B C] -> (simplify [* [* A B] C])
  Op A B -> [Op A B])

(output "simplify regular ~%")
  (let Exp [* x [+ [+ [* 12 0] [+ 23 8]] y]]
    (time (do-times 200000 (/. _ (simplify Exp)))))

(factorize simplify)
(factorize s)

(output "simplify optimized ~%")
  (let Exp [* x [+ [+ [* 12 0] [+ 23 8]] y]]
    (time (do-times 200000 (/. _ (simplify Exp)))))

rkoeninger commented 4 years ago

Thanks for figuring all that out. I see you got familiar with the codebase very quickly - I didn't have time to answer your questions before you found them yourself. Your approach is nice and neat also.

What's the final verdict here? It's slower with this optimization added? Maybe it would be faster with statement-oriented syntax, but that would be the significant re-work ("re-write") I referred to.

Factorization seems like a very specific fine-tuning at this time. ShenScript has bigger issues, like an over-abundance of async/await. The introduction of async/await slows it down by a factor of 4-8. Improving that is really the top performance priority.

tizoc commented 4 years ago

It was quite easy once a few things about the design and implementation "clicked", the code is clear and very well organized.

The verdict is that, at this time it is not worth it. I have a few ideas I may try later regarding expressions vs statements -- the short version is that the context includes a flag that lets the inner expression know how its result will be used, along with a name to which it will be bound to if any, but it is still very fuzzy in my mind, so don't worry about it, I'm not even sure yet if it is doable.

Regarding async/await, I noticed that other than a static list, you don't keep a record of which functions are known to not be async. Keeping a record could help by removing a lot of asyncs, but the order on which functions are defined will affect what you recognize as sync.

I'm not entirely sure yet about how much overhead the async/await stuff has (in addition to the intermediary functions already there) based on the tests I pasted on that comment above, because the same function (either factorized or regular), took mostly the same amount of time to run in both sync and async versions, which I found quite surprising.

You can consider this issue obsolete and close it, the topic can be revisited later.

rkoeninger commented 4 years ago

Thanks for trying this out. It was worth a shot.

Regarding async/await, I noticed that other than a static list, you don't keep a record of which functions are known to not be async. Keeping a record could help by removing a lot of asyncs, but the order on which functions are defined will affect what you recognize as sync.

It's limited to a static list because user-defined functions can be redefined. I had ideas about how to improve this - if you want to talk about async/await optimization, we can do that in another issue.

tizoc commented 4 years ago

One of the things I have on my queue is a program build tool for Shen/Scheme that assume that the program being compiled is not dynamic (not running on the REPL and not using eval), and take advantage of that (both in the Klambda->Scheme compiler, and then Chez itself). Once it is working it should be generalizable to other ports too. Function redefinition is something the compiler would not have to worry about when being compiled that way.

tizoc commented 4 years ago

Update on this one, I made a translation of the factorized example above:

(define example
  [1 X | Xs] 1 -> X
  [1 X | Xs] 2 -> Xs
  [2 X | Xs] _ -> X)

to a version that uses labelled breaks:

(((shen$2ef_error$c, example$s) =>
    $.d('example', $.l((V1345, V1346) => {
        $25$25label1347: {
            if ($.isCons(V1345)) {
                const V1345$2fhd = V1345.head;
                const V1345$2ftl = V1345.tail;
                $25$25label1348: {
                    if (1 === V1345$2fhd && $.isCons(V1345$2ftl)) {
                        if (1 === V1346) {
                            return $.asCons(V1345$2ftl).head;
                        } else if (2 === V1346) {
                            return $.asCons(V1345$2ftl).tail;
                        } else {
                            break $25$25label1348;
                        }
                    } else {
                        break $25$25label1348;
                    }
                }
                if (2 === V1345$2fhd && $.isCons(V1345$2ftl)) {
                    return $.asCons(V1345$2ftl).head;
                } else {
                    break $25$25label1347;
                }
            }
        }
        return $.b(shen$2ef_error$c.f, example$s);
    })))($.c('shen.f_error'), $.s `example`))

Using this test:

(let L [2 3 5]
  (time (do-times 2000000 (/. _ (example L 3)))))

Times:

Regular: run time: 2.065999984741211 secs
Factorized with functions for lets: run time: 2.7860000133514404 secs
Factorized with labelled breaks and const for lets: run time: 1.7890000343322754 secs

rkoeninger / ShenScript

Pattern matching factorization optimization #13