[performance] autorun with delay deriving state unnecessarily

realyze commented 1 year ago

When autorun uses a delay, we compute derived state eagerly when dependent state is invalidated, i.e. before the autorun actually runs. It's unclear why we need to (or should) do that since computing the dependent state early means there is an opportunity for the state to be mutated yet again before autorun runs which means we potentially need to recompute one more time when autorun runs.

Steps:

Create an autorun with delay 1000 ms that observes a computed c which depends on observable o with value 0
At 1500ms set o to 1, this will trigger a recomputation of c (but why does it need to? We're still 500ms from autorun reading that value)
At 1750ms set o to 2, this won't trigger a recomputation of c but it means we'll need to recompute at 2000ms when autorun runs
At 2000ms, autorun will run and recompute (with o === 2)

Here's a minimal reproducible scenario: https://codesandbox.io/s/mobx-autorun-recalculation-issue-z3vggt

urugator commented 1 year ago

upsuper commented 9 months ago

I think this is really a question of semantics, or expectation of the behavior. More specifically, do you expect the autorun to be triggered if the computation result is not changed even though its dependency is changed?

For example, in the testcase you provided, if we make the computed do Math.round(o.get()) rather than returning the value directly, and set it to 0.1 at 1500ms, and 0.9 at 1600ms, would you expect the autorun to be triggered at 2500ms (triggered by setting o to 0.1 then delay by 1000ms) or at 2600ms (triggered by computation result changing to 1 then delay by 1000ms)? I would probably expect the latter that the autorun is triggered only when its direct dependency is actually changing. And to do that, a eager recomputation would be necessary.

upsuper commented 9 months ago

I'd note, though, that the autorun should never be run at 2000ms as you indicated, because there is nothing triggering it at 1000ms. There is probably some misunderstanding on how delay works.

realyze commented 9 months ago

Perhaps the issue is better visible with a reaction rather than an autorun. See e.g. here.

There's a reaction with a 100ms delay and being a reaction, we never ought to fire its side-effect fn if the expr function result did not change.

Now, from the console log it's clear we recalculate the value of comp twice within the given delay interval. That should not be necessary. I think we ought to be able to only invalidate it (which is cheap) and delay its recalculation until it's actually read by the reaction.

realyze commented 9 months ago

My mental model of delay is basically that while we're waiting on delay, we should only ever invalidate dependencies (i.e., mark them as dirty) but never recalculate them. Recalculation should only happen once per delay interval when delay elapses (and that's when we decide whether either the autorun or the side-effect fn of a reaction ought to run).

upsuper commented 9 months ago

I think that mental model mismatches the implementation, and possibly the intention here.

In my understanding, delay is the interval between when the autorun (or side effect of reaction) is triggered and when it is actually executed. But when is the autorun triggered? It is when any of its direct dependency is changed. And when is reaction triggered? It is when the return value from expression part changes. So an eager calculation would be needed to know whether the autorun or reaction would actually become triggered. But after it becomes triggered, further recalculation would not be needed until they are actually used.

upsuper commented 9 months ago

I think this behavior is useful in many cases when changes from upstream dependency doesn't actually affect the result, like in my example above doing Math.round. It avoids triggering the autorun unnecessarily, and I think conceptually it is also cleaner.

realyze commented 9 months ago

Ah yes, okay, you're right that autorun fires even when values of dependencies change and then change back within the same delay period (i.e., it runs even when after delay elapses, the dependencies are the same as they were originally). So you're right that my mental model is different than the implementation. 🤔

So an eager calculation would be needed to know whether the autorun or reaction would actually become triggered

Hmmm I don't think that's necessarily true for reactions? Disregarding implementational details, we should be able to mark the dependencies as "potentially dirty" without recalculating them (that's actually what mobx does internally). And to communicate to the reaction that it has become "potentially dirty" too and will—at some point in future—need to recalculate its expr function to determine whether it should run the side-effect. But I don't see why we'd need to eagerly recalculate the value returned from the expr function?

upsuper commented 9 months ago

But I don't see why we'd need to eagerly recalculate the value returned from the expr function?

Same as above. Let's say

const box = mobx.observable.box(0);
mobx.reaction(() => Math.round(box.get()), v => console.log(v), { delay: 1000 });
await delay(1400);
box.set(0.1);
await delay(200);
box.set(0.9);

When would you expect the side effect be executed? Should it be 2400 (box.set(0.1) + 1000 delay) or 2600 (box.set(0.9) + 1000 delay)? I think the expectation here is to run it at 2600 when the expression result actually changes at 1600 with a delay of 1000.

Now, if we can mark it "potentially dirty", we would make it potentially to-be-executed at 2400. As you said, we may delay that calculation to 2400, at which point we run the calculation and found that the result changes (because box.set(0.9) is already done), so we execute the side effect at 2400 (rather than 2600), which would feel more confusing. Making the calculation lazy while still triggering the side effect at the right moment would require the expression to be evaluated at a time in the past, which I don't think is possible given how MobX is used in general.

Another alternative is to do the calculation at 1600 just before box is updated again, but I don't think that saves us anything.

taj-p commented 9 months ago

Thanks @upsuper and @realyze. Using @upsuper's flow, I believe the below is what we want and most closely abides by the existing implementation:

const box = mobx.observable.box(0);
mobx.reaction(() => Math.round(box.get()), v => console.log(v), { delay: 1000 });
await delay(1400);
box.set(0.1);
await delay(200);
box.set(0.9);

Current State


                                      MobX uses a debounce on first invalidatio
                                     so we delay by 1s since first "dirty" stat
                                        ┌────────────────────────────────┐
                                        │              1s                │
                                        │                                │

 0     XXXX           1400             1600                            2600
 ─────────────────────────────────────────────────────────────────────────────
      setup       box.set(0.1)      box.set(0.9)                       run effect
      reaction
                    recompute        recompute
                    expression       expression

                    reaction is not  reaction is
                    invalidated      invalidated

In the current state, the proposed flow re-runs the reaction's expression twice.

Ideal State


                      MobX uses a debounce on first invalidatio
                     so we delay by 1s since first "dirty" stat
                        ┌────────────────────────────────┬───────────────┐
                        │              1s                │               │
                        │                                │               │

 0     XXXX           1400             1600            2400            2600
 ─────────────────────────────────────────────────────────────────────────────
      setup       box.set(0.1)      box.set(0.9)                       run effect
      reaction                                        recompute
                   mark reaction                      dependencies
                   as maybe dirty

In an ideal world, we only check for invalidation just before we need to run our effect to minimize work done during invalidation processing.

The expression function doesn't need to run at 2400 nor 2600. When we set up the reaction, we ran the expression immediately so MobX could build its derivation graph, so we are already aware of which observables may invalidate this reaction at 1400.
Thus, at 1400, our reaction first realises it may be invalidated and runs its runReaction_.

This is where we diverge. In the existing implementation...

At 1400, we hit this shouldCompute block to determine whether a previously updated observable has changed. This tells us, definitively, whether any dependency has changed by recomputing all derivations.

This is the crux of this issue. We don't need to perform this work at 1400. Even if our dependencies haven't changed, can't we wait until 2400 to re-compute our dependencies to check whether any observable has changed? If they haven't, then we debounce to 2600 and retry (since that was the last time our dependencies were updated!).

So, if, for example, you set the box to (0) at 1400, can't we wait till 2400 to determine whether our dependencies have meaningfully changed?

In the current implementation, we recompute all computeds IIUC. Consider the impact of the below:

const box = mobx.observable.box(0);
mobx.reaction(() => Math.round(box.get()), v => veryExpensiveWork(v), { delay: 1000 });
await delay(2000);
for (let i = 0; i < 1_000_000; i++) { await delay(10); box.set(i); }
// That's a lot of re-derivation work we just did!

The proposal says: "let's not re-compute every 10ms and, instead, only check for invalidation once our scheduler says to".

But, admittedly, the devil might be in the details of the implementation! As @upsuper mentions, how do you know whether invalidation actually occurred at 1400 or 1600 without running an invalidation test at both points in time?

I have a draft POC here - this is the test where we start to break down.

I don't think it's possible for us to reliably determine whether invalidation occurred at 1400 or 1600, but the value of not running derived state synchronously when the effect is delayed is valuable.

As seen in the PR description, we do a lot of work in a render task that, sometimes, isn't used for 60 frames or more. I wonder if a maxDelay|runReactionMaxDelay or similar API has merit.

mobxjs / mobx

[performance] autorun with delay deriving state unnecessarily #3724

Current State

Ideal State