Intermediate expression values should not touch the DOM

This is a major one, as it requires significant changes to the way expressions work, so this may stay open for a while and get its milestone deferred several times. Of course, there's always the possibility that there's an easy solution I'm not seeing 😄

Right now, intermediate expression values touch the DOM several times until they resolve. The intermediate values of each cycle of calculation are output to the DOM as if they were final values. Mavo has no concept of a final value vs an intermediate value, all steps of a calculation are seen as equal.

This has the benefit that cycles do not lock up the UI, instead they create an animated infinite loop, making debugging easier. Cycles are cases where a property refers to itself, such as <span property="a">[a]b</span>. This is the result:

mavo-loop

However, there are also several drawbacks of this approach:

It makes it very hard for third-party code to react to expression evaluation because it's impossible to tell when it's stopped (they can only guess via hacks with timers etc). This makes SSR especially hard, because it doesn't know when rendering has finished (cc @betaveros).
It degrades performance, since DOM operations are slow.
It causes more errors or pointless HTTP requests for certain attributes (see #475)

The first two problems can be seen very vividly in this testcase: https://codepen.io/leaverou/pen/ErEeZO This is the worst case scenario: All dependencies are in reverse source order and the dependency chain is of a fairly long length (26), so the computation of z takes 26 cycles to complete. The "history" panel uses a mutation observer to record all intermediate stages that the content of <span property="z"> goes through. Try pressing "Clear" and then changing the value of a. You will notice that this time there is only one change for z: this is because the first time the intermediate stages include expression code (e.g. y + 1 becomes x + 1 + 1 and then w + 1 + 1 + 1 and so on — resulting in NaN because the addition operator only outputs numbers). However, even subsequent times it takes 26 cycles to complete the calculation, because we don't reorder the graph for efficiency based on dependency heuristics (as @karger suggested), but continue to evaluate in source order.

If we complete calculation cycles without DOM output, this will make expression evaluation WAY faster, but will also lock up the UI in cases of loops (which we cannot detect, as we'd be solving the halting problem). I wonder if there's a middle ground here that gives us the benefits without the lockup. I see a few cases:

We keep a dependency graph, based on the identifiers of each expression (which we already extract and use) and show an error if cycles are detected. Given that the expression syntax could even be arbitrary JS, this could have a lot of false positives, the incidence of which is unknown.
We keep track of the number of cycles that a calculation took and cap it at some point. I believe @karger said that Excel used to do this at some point.
Perhaps a combination of the two: If the dependency graph detects a cycle AND the calculation is taking more than N cycles, conclude it's an infinite loop and show an error.

FWIW Google Sheets seems to support both. This is the result with a simple a1=b1 infinite recursion: These are the settings for "Iterative calculation": and when I enable it, A1 becomes 99 and B1 becomes 100. Lots of things we can learn from this!

I think we should optimized both performance and correctness for acyclic dependencies, since those are both the most common and the most inherently correct. In particular, using the dependency graph will ensure that, if the graph is acyclic, each value changes at most once during a recalculation, which should reduce complexity and performance for anything that depends on it.

I'm pessimistic about detecting cycles by any means other than running the computation and seeing something update more than once. The language is so sophisticated (and contains js) such that I doubt we can identify all the weird hacks that could cause on value to depend on another. I especially doubt that just using identifiers will reveal all dependencies. All we can hope to do is notice when a value is different than last time we computed it.

So I'd suggest constructing the dependency graph, topologically sorting it to get an update order, but also periodically looping over it to see if values changed when we didn't expect them to, exposing cyclicity.
We won't necessarily know where the cycle is, but that would tell us to keep looping more aggressively, maybe warn the user as well.

------ Original Message ------ From: "Lea Verou" notifications@github.com To: "mavoweb/mavo" mavo@noreply.github.com Cc: "David Karger" karger@mit.edu; "Mention" mention@noreply.github.com Sent: 2/10/2019 10:02:25 PM Subject: [mavoweb/mavo] Intermediate expression values should not touch the DOM (#476)

This is a major one, as it requires significant changes to the way expressions work, so this may stay open for a while and get its milestone deferred several times. Of course, there's always the possibility that there's an easy solution I'm not seeing 😄

Right now, intermediate expression values touch the DOM several times until they resolve. The intermediate values of each cycle of calculation are output to the DOM as if they were final values. Mavo has no concept of a final value vs an intermediate value, all steps of a calculation are seen as equal.

This has the benefit that cycles do not lock up the UI, instead they create an animated infinite loop, making debugging easier. Cycles are cases where a property refers to itself, such as <span property="a">[a]b. This is the result:

mavo-loop https://user-images.githubusercontent.com/175836/52543670-5ad96480-2d79-11e9-8295-8196df856490.gif

However, there are also several drawbacks of this approach:

It makes it very hard for third-party code to react to expression evaluation because it's impossible to tell when it's stopped (they can only guess via hacks with timers etc). This makes SSR especially hard, because it doesn't know when rendering has finished (cc @betaveros https://github.com/betaveros).It degrades performance, since DOM operations are slow.It causes more errors or pointless HTTP requests for certain attributes (see #475 https://github.com/mavoweb/mavo/issues/475) Both problems can be seen very vividly in this testcase: https://codepen.io/leaverou/pen/ErEeZO This is the worst case scenario: All dependencies are in reverse source order and the dependency chain is of a fairly long length (26), so the computation of z takes 26 cycles to complete. The "history" panel uses a mutation observer to record all intermediate stages that the content of goes through. Try pressing "Clear" and then changing the value of a. You will notice that this time there is only one change for z: this is because the first time the intermediate stages include expression code (e.g. y + 1 becomes x + 1 + 1 and then w + 1 + 1 + 1 and so on — resulting in NaN because the addition operator only outputs numbers). However, even subsequent times it takes 26 cycles to complete the calculation, because we don't reorder the graph for efficiency based on dependency heuristics (as @karger https://github.com/karger suggested), but continue to evaluate in source order.

If we complete calculation cycles without DOM output, this will make expression evaluation WAY faster, but will also lock up the UI in cases of loops (which we cannot detect, as we'd be solving the halting problem). I wonder if there's a middle ground here that gives us the benefits without the lockup. I see a few cases:

We keep a dependency graph, based on the identifiers of each expression (which we already extract and use) and show an error if cycles are detected. Given that the expression syntax could even be arbitrary JS, this could have a lot of false positives, the incidence of which is unknown.We keep track of the number of cycles that a calculation took and cap it at some point. I believe @karger https://github.com/karger said that Excel used to do this at some point.Perhaps a combination of the two: If the dependency graph detects a cycle AND the calculation is taking more than N cycles, conclude it's an infinite loop and show an error. FWIW Google Sheets seems to support both. This is the result with a simple a1=b1 infinite recursion: image https://user-images.githubusercontent.com/175836/52544492-29639780-2d7f-11e9-8639-f6ed7cddc28c.png These are the settings for "Iterative calculation": image https://user-images.githubusercontent.com/175836/52544521-4f893780-2d7f-11e9-8ee7-509401a5831e.png and when I enable it, A1 becomes 99 and B1 becomes 100. Lots of things we can learn from this!

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/mavoweb/mavo/issues/476, or mute the thread https://github.com/notifications/unsubscribe-auth/ABFpXuSZj861WjNORHCzEToMBdJLxVjbks5vMN1BgaJpZM4azc8s.

mavoweb / mavo

Intermediate expression values should not touch the DOM #476