Derived stores are re-evaluated on invalid upstream state

Describe the bug

Note: The following block quote is from #9458.

Currently, a change to a store's value only causes stores which are immediately dependent on it to be invalidated. Svelte does not propagate invalidation downstream to subsequent derived stores, and this can cause issues when two criteria are met:

the store dependency graph forks and merges again at least once; and

the two or more paths between the fork and merge points are of unequal length.

Svelte's current implementation correctly handles dependency diamonds, but such cases do not meet the second criterion; unequal path length.

Example

Consider the following dependency graph—A is a readable store, while B and C are derived stores.
stateDiagram-v2
  direction RL

  # classDef Readable font-weight:bold,stroke-width:3
  # class C Readable

  B --> A
  C --> A
  C --> B
Svelte's Current Implementation

In Svelte's current implementation, when the value of store A changes, stores B and C are each invalidated once. This is recorded in the pending variable in derived(). Immediately afterwards, both stores B and C re-evaluate, store C using the old value of store B that it recorded in its private values array during initialisation of the store graph.

Once stores B and C have changed value, they invalidate their subscribers. Store C has no subscribers to invalidate, but is invalidated itself by its dependency, store B. This causes store C to re-evaluate a second time, this time reaching the correct value.

The first re-evaluation of store C is wasted. This is arguably out of keeping with Svelte's efficiency, but is also—and more importantly—performed on invalid state. An exception could be thrown if, for instance:

Store A held an array of objects.

Store B held an index within that array.

Store C held a property of the element of the array at that index.

The array stored in store A were to shorten below the index in store B.

In this case, the index (store B) will become out of range, the object will become undefined, and property access on that value will yield a TypeError.

I suspect that JavaScript's loose typing allows a lot of these premature re-evaluations to go unnoticed in many real-world applications, generating short-lived NaNs and undefineds rather than throwing errors. I did encounter this issue organically, though.

See my PR (#9458) for a different implementation of svelte/store which avoids this issue. My implementation treats derived stores, in their capacity as a store subscriber themselves, with extra attention. This enables more accurate tracking of derived store validity and ensures that re-evaluation doesn't happen on invalid upstream state.

Reproduction

The following example has sound application logic. If the two derived stores were combined into one, it would work as expected. Instead, Svelte re-evaluates the assertion store twice for every change in the writable store number. The first of each of these re-evaluations feeds invalid state to assertion's update function, producing an invalid output, e.g., "3 × 3 is even".

<!-- App.svelte -->
<script>
  import { onDestroy } from 'svelte';
  import { writable, derived } from 'svelte/store';

  const values = [];

  let number = writable(1);
  const triple = derived(number, ($number) => $number * 3);
  const assertion = derived(
    [number, triple],
    ([$number, $triple]) => `${$number} × 3 is ${$triple % 2 === 0 ? 'even' : 'odd'}`,
  );

  let assertions = [];
  const unsubscribe = assertion.subscribe(a => assertions = [...assertions, a]);

  onDestroy(unsubscribe);
</script>

Increment the field value and observe the messages.<br>
<input type="number" bind:value={$number}>
<pre>{assertions.join('\n')}</pre>

Logs

Regarding logs, no exceptions are thrown unless the invalid state causes the derived store's update function to throw one. I encountered this issue while writing a derived store which attempted to access a property of an object in an array, with both the array and index being the values of other stores. In this case, the error would show "Uncaught TypeError: can't access property "foo" of undefined."

System Info

n/a

Severity

Blocking all use of svelte/store.

Moving context from #10451

Example

Consider the following example:

const a = writable(1);
const b = derived(a, a => a*2);
const c = derived([a,b], ([a,b]) => a+b);
c.subscribe(c => console.log(c));
...
<input type=number bind:value={$a} />

This creates a dependency graph something like:

stateDiagram-v2
    direction RL

    input
    a
    b
    c
    log

    log --> c
    c --> b
    b --> a
    c --> a
    a --> input

Svelte's Current Implementation

With sveltes current implementation, the derived store c will prematurely evaluate every time store a changes. In the example above, if we change the input to 2, the current implementation will go through the following sequence:

sequenceDiagram
    autonumber

    participant input
    participant a
    participant b
    participant c
    participant log

    note right of a: a == 1
    note right of b: b == a * 2 == 1 * 2 == 2
    note right of c: c == a + b == 1 + 2 == 3

    input ->> a: 2
    note right of a: a == 2
    a -->> c: invalidate
    activate c
    a -->> b: invalidate
    activate b
    a ->> c: 2
    deactivate c
    rect rgb(255, 128, 128)
    note right of c: c == a + b == 2 + 2 == 4
    c ->> log: 4
    end
    a ->> b: 2
    deactivate b
    note right of b: b == a * 2 == 2 * 2 == 4
    b -->> c: invalidate
    activate c
    b ->> c: 4
    deactivate c
    note right of c: c == a + b == 2 + 4 == 6
    c ->> log: 6

Following the diagram, it's clear at point (5) that store c is evaluating prematurely. At point (5) store c believes both its dependencies to be valid, but in fact only a has been resolved at this point.

The correct behaviour would be for invalidations to be "deep" and for resolution to only occur once all dependencies are fully resolved. Thus in the given example, c should only emit once for each change to a.

Why fix it?

Prematurely evaluating a derived store in many situations would result in just a brief glitch - an incorrect calculation immediately followed by the correct one. But in many contexts, the derived store subscriptions result in side effects. Depending on the nature of these side effects, the results of a premature evaluation with incorrect data may be quite pronounced - sending data to a service, permanently modifying data, crashing an application.

Reproduction

See: semi-diamond dependency problem REPL

As a head start for any patch request, here is a test case:

    it('only updates once dependents are resolved', () => {
        const a = writable(1);
        const b = derived(a, a => a*2);
        const c = derived([a,b], ([a,b]) => a+b);

        const values: number[] = [];

        const unsubscribe = c.subscribe(c => {
            values.push(c);
        });

        a.set(2);
        a.set(3);
        assert.deepEqual(values, [3, 6, 9]);
    });

note that the above test case fails with the current implementation

@mnrx, what was you basic solution for resolving this? The solution I went for was to have an "invalidate" and "revalidate" option for the subscribe method. A deep invalidation was done whenever a change was made and the revalidation was necessary to support alternate equality checks and thus wouldn't be necessary with the default greedy implementation.

In the context of fixing just this issue, I believe simply performing a deep invalidation may be sufficient which would be a much smaller code change.

I've written a preliminary patch to resolve this (see https://github.com/WHenderson/svelte/tree/develop/stores-diamond-dependency) and in doing so I've noticed the test case "derived dependency does not update and shared ancestor updates". It is an exact mirror of the example test case I have given and documented in the sequence diagram above.

As it stands, this existing test case codifies the exact behaviour we are trying to avoid by raising this issue.

This decision appears to have been made in response to #3191 / bbeafbafab0e10e4f88f898890d97f5e508f5609 @btakita , I can't tell if the test case you've written is to check for desired behaviour or to codify current behaviour. Are you able to weigh in? I know its been some time!

FYI, this is what I believe the correct behaviour should be:

sequenceDiagram
    autonumber

    participant input
    participant a
    participant b
    participant c
    participant log

    note right of a: a == 1
    note right of b: b == a * 2 == 1 * 2 == 2
    note right of c: c == a + b == 1 + 2 == 3

    input ->> a: 2
    note right of a: a == 2
    a -->> c: invalidate
    activate c
    a -->> b: invalidate
    activate b
    b -->> c: invalidate
    a ->> c: 2
    a ->> b: 2
    deactivate b
    note right of b: b == a * 2 == 2 * 2 == 4
    b -->> c: invalidate
    b ->> c: 4
    deactivate c
    note right of c: c == a + b == 2 + 4 == 6
    c ->> log: 6

Notably, the derived store doesn't compute until all of its dependencies are resolved.

Thank you for spending time on this.

Abandoned Approach

What was your basic solution for resolving this?

After giving some thought to the general problem of managing reactive state with dependencies, I noticed that this bug can be avoided by only ever re-evaluating stores in the order that they are defined. Using the Svelte store API, at least, I don't believe it's possible to create a store that is dependent on another store which is defined later in the source code. A different store API or a different programming language may make this possible, but would also open the door to dependency rings in doing so.

On a side note, Rich Harris mentions spreadsheet software in an old presentation of his, pointing out the similarity of keeping an application UI up-to-date to keeping formulaic cells in a spreadsheet up-to-date. Even now, trying to create a circular dependency in Numbers on macOS yields the following error: "This formula can’t reference its own cell, or depend on another formula that references this cell." I think the limitation of only being able to depend on already-defined stores is reasonable, and simplifies store implementation.

With this in mind, I modified writable to include an ID in the internal state of each store it created, starting from zero and incrementing by one for each store. In theory, this would allow for a simple re-evaluation algorithm:

propagate an invalidation message downstream to all dependent stores;
push each store's ID to an array as it is invalidated;
sort and de-duplicate the resulting list; and
re-evaluate each store referenced in the array in the order they appear in it.

In practice, this solution moves a significant amount of logic out of the stores an into some kind of "resolver" that monitors them. It also required keeping a reference to every store created so that those stores could be later accessed by ID. Thirdly, I had a feeling that delayed creation of stores, such as in a component not shown when the application loads, would present further issues. These and other difficulties led me to abandon this approach in favour of that described in PR #9458. That approach is described in the PR and in the points I make in the next section, but please ask questions if those explanations don't cover enough.

Ultimately, I don't believe it possible to build a store implementation that handles complex graphs correctly unless a control layer is created above the stores, or the individual stores have more knowledge of their place in the graph. This context is provided in the algorithm above by a store's ID, and it's provided in my implementation by the invalidation step.

Diagram and Patch Implementation

Your diagram and patch implementation seem to match each other, but I think I can identify two issues with them. Consider the following store graph—stores A and B are writable and the rest are derived:

stateDiagram-v2
  direction RL

  # classDef Readable font-weight:bold,stroke-width:3
  # class C Readable

  E --> B
  C --> A
  D --> A
  E --> C
  E --> D
  F --> E

Stores must keep a counter of invalidations rather than a boolean.

In the example, when store A updates, store E will be invalidated twice. It's important that store E only re-evaluate after receiving exactly two update notifications from upstream stores. If store E only uses a boolean value to store its invalidity, the first update notification, likely from store C, will cause it to re-evaluate prematurely.

It's also worth noting that store B did not update, so only two out of store E's three dependencies updated. For this reason, a derived store cannot simply wait until it has received as many update notifications as it has dependencies before re-evaluating.
Stores must only invalidate their dependants when their value changes as the result of a direct call to their public set method, but not when it changes as the result of re-evaluation. In other words, derived stores should not invalidate their dependents when re-evaluating, only when they are themselves invalidated for the first time. This means that step 7 in your chart should be removed.

A derived store will never receive more update notifications than it has dependencies, so it's important that a derived store is only ever invalidated at most once by each of its dependencies. If a derived store is invalided too many times, it will freeze and never re-evaluate. If a derived store is not invalided enough times, however, it'll re-evaluate prematurely.

In the example above, store F has only one dependency, and so will only ever receive a maximum of one update notification. So that it doesn't freeze, it must be invalided only once, immediately after it's sole dependency, store E, is invalidated for the first time.

In your patch implementation, the counting logic is handled by the bitwise operations on the pending variable. A bit field is used, I presume, to account for the possibility of multiple update notifications originating from the same dependency store, but this will only ever occur if derived stores are evaluated prematurely.

I will confess that I still believe a re-implementation is required rather than a small patch. I found the current implementation particularly inscrutable when I was tracing the origin of the bug. Extra complexity is also introduced, for instance, by Svelte passing both set and update functions to asynchronous stores. These functions may be called multiple times back-to-back during any given update, which further complicates invalidation and re-evaluation. This is avoided in the current implementation by not paying any attention to the order in which re-evaluation occurs.

Tests

I tested your patch implementation against the test suite in my original PR, and there are three more cases that fail. The relevant ones start on line 523 of tests/store/test.ts. The following tests fail:

"derived dependency does not update and shared ancestor updates"
"does not freeze when depending on an asynchronous store"
"disables set and update functions during on_start clean-up (on_stop)"
"disables set and update functions during derived clean-up"

Lastly, issue #3191 describes a valid problem, and I agree with the reporter's expected behaviour. My implementation passes the corresponding test, and I'm not sure why your patch implementation doesn't—I suspect it relates to point 2 above.

...In the example, when store A updates, store E will be invalidated twice

Hi @mnrx , the stores aren't keeping a simple boolean invalidation state. They are keeping a boolean for each dependency and will only resolve/execute once all are in a valid state. This is the way the existing code works, my only addition in the context of this patch is to do it "deeply".

I had originally made this explicit in the diagrams, but figured it overcomplicated them and distracted from the original message.

To be 100%, I implemented a test case for what you described and it works as desired.

Stores must only invalidate their dependants when their value changes as the result of a direct call to their public set()...

I disagree. Stores are set to invalid (perhaps a poor choice of word) when their value becomes indeterminate/pending. Until an invalid derived is re-executed, it cannot be determined if it will evaluate to the same value.

In your patch implementation, the counting logic is handled by the bitwise operations on the pending variable This is the original code and in the case of diamond dependencies or semi-diamond dependencies, the bitwise logic is necessary and will at times contain more than one bit set. In the previous sequence diagram (the one relating to the fixed version) at step 4, the value of the bitwise field will be 3 (0x11) to indicate that both inputs are in an invalid state

Tests

"derived dependency does not update and shared ancestor updates" Are you sure, this passes fine on mine?

"does not freeze when depending on an asynchronous store" As per # 2 above, I would expect the values to come out as [3,2,8]. Each call to length.set leaves lengths_derived in an invalid state until it is set using do_later. In this context, the async derived is behaving like a promise in the pending state. It's value is yet to be calculated.
"disables set and update functions during on_start clean-up (on_stop)"
"disables set and update functions during derived clean-up" I'm not sure fixes for these would relate to this specific issue. Would it not be better to address them in a separate issue?

Thank you for implementing that graph and testing against it.

Until an invalid derived is re-executed, it cannot be determined if it will evaluate to the same value.

This is true—I should have mentioned another relevant detail. In my implementation, stores will emit an update notification even if their value has not changed, but only if the subscriber is a derived store. This notification includes a boolean value passed with it; isChanged. This notification allows a derived store to always decrement its invalidity counter. Derived stores only re-evaluate if at least one of their dependencies issued a real update, as opposed to an empty update notification. An empty update notification represents the sentiment "previously, I warned you I was about to update, but I actually won't—false alarm!"

Looking again at your patch implementation, I think the invalidation initiated by derived stores may be unnecessary but it would require more logic to disable that step for derived stores specifically.

In this context, the async derived is behaving like a promise in the pending state. It's value is yet to be calculated.

This is fair, but would your preferred behaviour be for the whole store graph to be delayed in updating because one store is asynchronous? Svelte supports stores that change their value asynchronously, such as a store which updates its value after a time delay:

const foo = writable(false);
const fooChangedRecently = derived(
  [foo],
  ([$foo], set) => {
    set(true);
    setTimeout(() => set(false), 3_000);
  },
  false,
);

In this example, the store fooChangedRecently will typically emit two updates for every one update emitted by the store foo. It seems clear in this case that the synchronous call to set should allow the store graph to continue updating and the delayed call to set should start a new round of re-evaluation. I don't think that whether or not set and update receive at least one synchronous call between them should significantly change the general behaviour of the store graph.

Critically, if a derived store's value is to be considered invalid from the point of re-evaluation until an asynchronous action completes, I think the author should include an explicit synchronous call to set in that store's deriveValue function (named fn in the current Svelte implementation) in order to set the store value to a base value, e.g., undefined. I don't think this approach makes impossible any use case that the alternative enables.

On a related note, I should have mentioned that asynchronous calls to set and update from inside a derived store's deriveValue function will trigger normal store updates in my implementation, which caveats my prior point that derived stores should only invalidate their downstream stores when they are invalidated themselves for the first time.

Validity Counter vs. Bit Field

Some of our disagreement about the details of the algorithm probably stems from my decision to use a counter to represent invalidity rather than a bit field, as used by the current Svelte implementation. The bitwise logic seemed opaque to me, and limits derived stores to (probably) at most 32 dependencies—no dire limitation, to be sure. I opted to use a counter, which requires exactly as many invalidation notifications as update notifications to flow downstream in the graph.

Alternatively, I could have used a Set holding dependency indices, which would still remove the dependency limit and make the logic more readable while also not introducing a need to track invalidity exactly. This approach may be a more robust one, if slightly more memory-intensive.

Tests

"derived dependency does not update and shared ancestor updates"
Are you sure, this passes fine on mine?

This test case is still failing for me. It was marked to be skipped in your patch—was that change intentional?

I think the test fails because store a has notified store b during invalidation that it's about to update but never actually notifies it's subscribers of a new value. This is because it happens to re-evaluate to the same value as it already had and thus does not need to. If this is the root of the bug, some form of empty update notification like I described above should resolve it. To do this, the store logic in set will need to determine whether each of its subscribers is a derived store or a regular subscriber, or somehow else avoid calling regular subscribers with unchanged values.

Disabling `set` and `update`

This could certainly be its own issue, but I mentioned since it since it pertains to asynchronous calls to set and update. The difference is subtle, but makes for a less error-prone API, in my opinion. The following was an addition I made to the store documentation in PR #9458:

If set or update are called after the store has lost its last subscriber, they will have no effect. You should still take care to clean up any asynchronous callbacks registered in on_start by providing a suitable on_stop function, but a few accidental late calls will not negatively affect the store.

For instance, in the example below, set may be called late if the issPosition store loses its last subscriber after a fetch call is made but before the corresponding HTTP response is received.
import { readable } from 'svelte/store';

const issPosition = readable({ latitude: 0, longitude: 0 }, (set) => {
  const interval = setInterval(() => {
    fetch('http://api.open-notify.org/iss-now.json')
      .then((response) => response.json())
      .then((payload) => set(payload.iss_position));
  }, 3000);

  return function onStop() {
    clearInterval(interval);
  };
});

issPosition.subscribe(({ latitude, longitude }) => {
  console.log(
    `The ISS is currently above ${latitude}°, ${longitude}°` +
      ` in the ${latitude > 0 ? 'northern' : 'southern'} hemisphere.`
  );
});

Without the disabling of set and update, unexpected behaviour could be encountered after the quick stopping and starting of stores, such as could occur when dependent UI components are quickly hidden and shown. Conceptually, to me at least, a store losing its last subscriber and then gaining a new one should result in behaviour reminiscent of a computer being power cycled—persistent state is retained, but any pending asynchronous operations are lost.

This behaviour can be implemented by the application author if it isn't implemented by the store library, but I can't think of a situation in which the latter would cause a problem.

Looking again at your patch implementation, I think the invalidation initiated by derived stores may be unnecessary but it would require more logic to disable that step for derived stores specifically.

The invalidation initiated (continued) by derived stores is what provides a solution to the semi-diamond dependency problem. without this, the sequence is exactly the same as sveltes current implementation.

This is fair, but would your preferred behaviour be for the whole store graph to be delayed in updating because one store is asynchronous?

This is the current behaviour of async svelte stores. The patch doesn't really address any of the operational behaviour of async stores, it seeks only to address the "diamond dependency" issue in a more thorough way.

If we are to look into addressing some of the issues of async stores I would look to do that in a separate issue.

Validity Counter...

This is a different issue best addressed in a separate issue.

derived dependency does not update and shared ancestor updates

I misread your earlier post. You are correct, it does fail with the changes. This one is referenced in my post above with the link to my patch along with comments that describe the problem.

Disabling set and update

I agree that this is something of a foot gun, but should be raised as a separate issue.

As it stands, I think the existing patch addresses the specific issue raised in this post.

It would be good to hear from the maintainers about #3191 / bbeafbafab0e10e4f88f898890d97f5e508f5609, but it may be best to do that via a PR.

Correction

derived dependency does not update and shared ancestor updates

This is actually failing because I tried to skip part of my original solution (revalidation). Guess I'll bring that into the patch then :-)

The invalidation initiated (continued) by derived stores is what provides a solution to the semi-diamond dependency problem.

That invalidation is necessary, yes, but the following round of invalidation initiated by the derived store setting its new value is not. It would require more logic in set to determine if any given update was the result of re-evaluation or not, though, and extra invalidations are not a problem when using a bit field or Set to track validity. It's not a problem, only an observation.

This is the current behaviour of async svelte stores.

At least in my testing, your patch implementation outputs [3, 2, 8] for the test case I mentioned before, "does not freeze when depending on an asynchronous store". The current Svelte implementation and Rich's PR, by contrast, output [2] and freeze. I'm not sure how, but your patch partly resolves this issue.

If I'm interpreting issue #3191 and PR #3219 correctly, Svelte did have an algorithm like we're proposing before. To quote an old version of the codebase, "run_all(invalidators);". This appears to have been removed for the sake of performance—debateable prioritisation, in my opinion.

Regarding creating separate issues, that's fair.

That invalidation is necessary, yes, but the following round of invalidation initiated by the derived store setting its new value is not. It would require more logic in set to determine if any given update was the result of re-evaluation or not, though, and extra invalidations are not a problem when using a bit field or Set to track validity. It's not a problem, only an observation.

Yes, you are correct. Good catch. It wouldn't be hard to avoid this, as the store already tracks its validation state.

At least in my testing, your patch implementation outputs...

I'll have to take a look at this when I have some time.

"does not freeze when depending on an asynchronous store"

svelte 4 [3, 2, 5, 8, 11, 8]

at the point where length.set is being called, lengths_derivative erroneously believes all its inputs to all be in a valid state, so it calculates based off of the new $length value and the old do_later result. i.e.

[2,1] => 2*3-1 => 5
[3,1] => 3*3 - 1 => 8
[4,1] => 4*3 - 1 => 11

my patch #10557 [3,2,8]

When length.set is called it first marks all dependencies as invalid and since lengths_derivative doesn't change its value, dependencies on lengths_derivative remain invalid. Its only when do_later is eventually called that it propagates a change. That change is a closure around the last $length value. Subsequent calls to do_later set the same value which is ignore via safe_not_equal.

Rich's patch #10575 [3,2,8,8] Same as mine, except due to the way the set function has been changed subsequent calls to do_later with the same value are not ignored.

Hi, just want to chime in:

the problem of managing derived store had been pretty studied in incremental computation. there are basically 2 approaches, which is the two approach @mnrx figured out, with some slight modification.

The eager approach This is very close to the abandoned approach, but:

use order maintenance instead of int to sort store. order maintenance essentially allow creating derived_store later, even if some earlier store is modified to listen to it.
have a priority queue of stuff that need evaluation, indexed by order maintenance tag. note that value is not transitively added to the priority queue: if a is dirtied and b depend on a, b does not get added to the queue (unless b is also dirtied)
have a refresh() function, which pop stores off priority queue one by one, re-evaluating and dirtying their dependents.
you can call the refresh() function right after the external program write to a store, or e.g. once every 1/60 seconds, and saving evaluations.
if re-evaluating a store give the same result, there is no need to enqueue it's dependent.

see: Self-Adjusting Computation

The lazy approach This is like the one where @mnrx use int to track invalidation count, but use a boolean bit instead.

on dirtying, recursively look at dependents and dirty them too (no need to recurse if they are already dirtied)
when a value is request()ed, if it is not dirtied return it. if it is dirtied, request the value the store depend on (recursive), set the value and undirty itself, and return it.
to reevaluate everything, just request() all store in arbitary order. see: http://adapton.org/

pros of the eager approach:

allow the 'skip same result' optimization

pros of the lazy approach:

simpler
if you need only some value, will not re-evaluate unneed store (but still have to dirty them)

none is strictly better then the other.

Hi @MarisaKirisame, this issue is now linked to #10575 (Rich Harris) and #10557 (mine). These solutions, as well as the current svelte implementation, use something similar to the lazy approach you describe.

The question at this point is about how best to minimally extend the API to support un-dirtying a store. Take a look at the linked PR's to see the discussions there.

sveltejs / svelte