Volatile sources in the computation graph

PsychoLlama commented 1 month ago

Integrating with the platform inherently requires volatile functions (borrowing the term "volatile" from Excel formulas). The platform is filled with impure properties and functions that return different values when called over time. Sometimes they expose events for detecting changes. Sometimes they don't.

Here are some concrete examples:

Reading from localStorage
Observing location.hash
Subscribing to a media query

From what I've seen, the common pattern for integrating these APIs is to synchronize them into a signal replica:

const signal = new Signal.State(location.hash);

window.onhashchange = () => {
  signal.set(location.hash);
};

This works, but it means every integration installs a global listener that survives for the lifetime of the application. For a library of bindings like useHooks, that is a non-starter: the cost only grows with every binding.

So we optimize: only subscribe to the API when the value is actually used (meaning: under observation).

const signal = new Signal.State(location.hash, {
  [Signal.subtle.watched]() {
    window.onhashchange = () => {
      signal.set(location.hash);
    };
  },

  [Signal.subtle.unwatched]() {
    window.onhashchange = null;
  },
});

The Bug: This works, but only some of the time. Values can still be read when they are not being watched, and reading an unwatched signal can return a stale value.

// Is this value correct? Who knows. Only if an observer happened to capture it.
signal.get();

Imagine observing location.hash in a component, then a click event fires an async task and navigates away. The task finishes and uses signal.get(), but since the original component is no longer observing it, the value has gone stale. The task completes with stale data.

While this is a consequence of The Way it Works :tm:, it leaves a lot of space for bewildering footguns. To make this robust you need to know if the value is being observed and branch, either using the signal or reading the value from source. This applies to trees of Computed sources too. It isn't clear how a framework would solve this without devolving to "observe all platform bindings, all the time, forever".
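A minimal sketch of the "know if it is observed and branch" workaround described above. It does not use the Signal API: `platform`, `createHashBinding`, `watch`, `unwatch`, and `notify` are hypothetical stand-ins for location.hash, the watched/unwatched hooks, and the change event, so that it runs outside a browser.

```javascript
// Hypothetical stand-in for a platform value such as location.hash.
const platform = { hash: "#home" };

// Hypothetical binding: a cached replica plus a flag tracking observation.
function createHashBinding() {
  let cached = platform.hash;
  let observed = false;

  return {
    // Plays the role of the Signal.subtle.watched hook.
    watch() {
      observed = true;
      cached = platform.hash; // resync when observation starts
    },
    // Plays the role of the Signal.subtle.unwatched hook.
    unwatch() {
      observed = false;
    },
    // Plays the role of the platform's change event.
    notify() {
      if (observed) cached = platform.hash;
    },
    // Naive read: trusts the replica even when nothing keeps it fresh.
    readCached() {
      return cached;
    },
    // Robust read: use the replica only while it is being kept fresh.
    read() {
      return observed ? cached : platform.hash;
    },
  };
}

const binding = createHashBinding();
binding.watch();
platform.hash = "#settings";
binding.notify();
binding.unwatch();

// The platform changes while nobody is observing.
platform.hash = "#profile";

console.log(binding.readCached()); // "#settings" (stale)
console.log(binding.read()); // "#profile"
```

Every caller has to remember to use the branching read, which is the footgun in question.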

I'm not the first one to notice this. It's a recurring theme in other issues:

https://github.com/tc39/proposal-signals/issues/227 (attempt to bind localStorage resulting in similar issues)
https://github.com/tc39/proposal-signals/issues/165 (sketch of a very similar idea from dead-claudia)
https://github.com/tc39/proposal-signals/issues/9 (tangential, but good example of Math.random as a volatile source)

Proposal

Ultimately the challenge comes from maintaining two sources of truth: one in the platform and one in the signal. We can't keep the signal state fresh without permanently listening to the platform, and this causes different behaviors when observed and not observed. So I suggest we don't try.

Instead, I propose (bear with me) a new Signal.Volatile source that reads the value directly:

const signal = new Signal.Volatile(() => location.hash);

Every signal.get() uses the getter to pull the value. It is never stale, even when unobserved.

Unfortunately much like Excel, this has the effect of busting the cache for every computed down the chain. It's rather extreme. We can avoid it by tapping into change handlers for features that support it:

const signal = new Signal.Volatile(() => location.hash, {
  subscribe(onChange) {
    window.onhashchange = onChange;

    return () => {
      window.onhashchange = null;
    };
  },
});

In this hypothetical example, volatile signals with subscribe handlers would become non-volatile when observed (same cache semantics as signals) and revert to volatile when not observed (maintaining correctness when read outside Sauron's watchful gaze).
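To make the intended semantics concrete, here is a rough standalone model. It is not the Signal API or the polyfill: `VolatileSketch`, `observe`, `unobserve`, and the fake `source` object are all hypothetical, and a real implementation would also notify watchers when `onChange` fires.

```javascript
// Hypothetical model: reads through when unobserved, caches when observed.
class VolatileSketch {
  constructor(getter, { subscribe } = {}) {
    this.getter = getter;
    this.subscribe = subscribe;
    this.observers = 0;
    this.subscribed = false;
    this.unsubscribe = null;
    this.hasCache = false;
    this.cache = undefined;
  }

  get() {
    // Volatile mode: no live subscription, so always pull from the source.
    if (!this.subscribed) return this.getter();
    // Non-volatile mode: the subscription invalidates the cache for us.
    if (!this.hasCache) {
      this.cache = this.getter();
      this.hasCache = true;
    }
    return this.cache;
  }

  observe() {
    this.observers += 1;
    if (this.observers === 1 && this.subscribe) {
      this.unsubscribe = this.subscribe(() => {
        this.hasCache = false; // onChange: drop the cached value
      });
      this.subscribed = true;
    }
  }

  unobserve() {
    this.observers -= 1;
    if (this.observers === 0 && this.subscribed) {
      this.unsubscribe();
      this.subscribed = false;
      this.hasCache = false; // revert to volatile
    }
  }
}

// Fake external source with a single change listener (an assumption).
const source = { value: "a", listener: null };
function setSource(value) {
  source.value = value;
  if (source.listener) source.listener();
}

let reads = 0;
const sig = new VolatileSketch(
  () => {
    reads += 1;
    return source.value;
  },
  {
    subscribe(onChange) {
      source.listener = onChange;
      return () => {
        source.listener = null;
      };
    },
  },
);

sig.get();
sig.get(); // unobserved: both reads call the getter (reads is 2)
sig.observe();
sig.get();
sig.get(); // observed: one getter call, then the cache (reads is 3)
setSource("b");
const afterChange = sig.get(); // "b", cache was invalidated (reads is 4)
sig.unobserve();
source.value = "c"; // no listener attached anymore
const afterUnobserve = sig.get(); // "c", read straight from the source
console.log(reads, afterChange, afterUnobserve);
```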

I think the majority of platform bindings fall under this style, as does integrating with any external store.

Adding a new primitive is rather extreme, but for the life of me I can't figure out how to reconcile this with signals. I appeal to spreadsheets because it seems they haven't solved it either. Forgive my hubris.

EisenbergEffect commented 1 month ago

I quite like this.

PsychoLlama commented 4 weeks ago

I have a working proof of concept in my fork: https://github.com/PsychoLlama/signal-polyfill/pull/1/files

The goal was to explore the idea and eke out any unexpected consequences. Here's what wrinkled my forehead:

Can volatile functions depend on other states/computeds?
Can volatile sources read from other volatile sources?
What happens if you call onChange immediately during the subscribe callback?
What happens if you call onChange during a Watcher callback?
What should happen if the subscribe handler throws?

I answered "yes" to the first two questions and left the rest unhandled.

One of the more surprising outcomes was how similar it was to Signal.Computed. Almost every design decision logically followed what Computed had done. The bits left unimplemented (error handling, circular functions) are already handled by Computed.

It left me wondering: should this be a mode on Computed instead?

new Signal.Computed(() => location.hash, {
  volatile: true,
  // ... provide some mechanism to upgrade to non-volatile with an `onChange` handler ...
})

On a similar line, dynamically upgrading a signal from volatile to non-volatile depending on whether it's observed is very nice ergonomically, but it can be implemented with the existing tools if you're willing to create a few extra nodes:

const volatileHash = new Signal.Computed(() => location.hash, {
  volatile: true,
})

const dynamicSource = new Signal.State(volatileHash, {
  [Signal.subtle.watched]() {
    queueMicrotask(() => {
      dynamicSource.set(new Signal.State(location.hash))
      // set up change handlers
    })
  },
  [Signal.subtle.unwatched]() {
    dynamicSource.set(volatileHash)
    // tear down change handlers
  },
})

const hash = new Signal.Computed(() => {
  const maybeVolatile = dynamicSource.get()
  return maybeVolatile.get()
})

Essentially, this creates an isObserved signal that, when watched, swaps out the volatile Computed for a non-volatile State that subscribes and keeps its cache updated.

In summary, I think I can drastically reduce the scope of this proposal by only adding a volatile mode to Computed. No subscribe-to-upgrade or new Signal.Volatile source.

EisenbergEffect commented 3 weeks ago

I think we've had some requests for un-cached computeds from other people. I'm not sure if we've gathered all the use cases around that yet, but this seems like a pretty solid use case to add to the list. /cc @littledan

littledan commented 3 weeks ago

This is a great issue, thank you for filing this. I agree that watched/unwatched present a bit of a footgun at the moment. What if we made the semantics that, if you have a watched callback, but the computed signal is not being watched, then the signal is treated as "always dirty"/uncached?

littledan commented 3 weeks ago

On second thought, I'm not sure that strategy would work so well for the Signal -> Observable conversion, since you'd want that to really just throw when it's not watched.

shaylew commented 3 weeks ago

Aha, noting that volatility is (forward-)contagious is very useful here!

... But unlike "is loading" or "has read data from a transaction" or other proposed forward-contagious properties, it does kind of want to affect the behavior of baseline computeds, not just propagate transparently through them.

The other open question it's adjacent to is: what happens when a signal, or one of its dependencies, gets re-dirtied while checking stale dependencies? Or more generally, what do you do during get when, after checking and cleaning each upstream dependency once, you fail to reach a state where they're all clean?

If you interpret the state of a Computed as being supposed to represent a bound on the states of its dependencies ("a Computed being t-clean for timestep t implies all its dependencies are also t-clean") then you are kind of forced to say that, if cleaning the dependencies during .get doesn't actually leave them fully clean, the Computed you called get on had better not be marked clean afterwards either. That creates the right notion of contagious volatility, and all you need is a widget (built-in or constructed) to produce the original never-clean node.

(I'd have to play around to see whether that original node is expressible without extra constructs; you can definitely make it dirty enough, but you also have to prevent watchers from being able to detect your re-dirtying strategy.)
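One possible reading of that rule, as a toy pull-only graph: a computed is marked clean only if nothing it read during evaluation was left unclean. `NeverClean` and `CachedComputed` are hypothetical names, and the sketch has no State, watchers, or invalidation. It only models the clean bit.

```javascript
// The computed currently being evaluated, if any.
let currentComputed = null;

// Hypothetical never-clean node: every read pulls from the getter.
class NeverClean {
  constructor(getter) {
    this.getter = getter;
  }
  get() {
    if (currentComputed) currentComputed.sawUnclean = true;
    return this.getter();
  }
}

// Hypothetical computed that may only cache when all its reads were clean.
class CachedComputed {
  constructor(fn) {
    this.fn = fn;
    this.clean = false;
    this.sawUnclean = false;
    this.value = undefined;
    this.runs = 0;
  }
  get() {
    if (!this.clean) {
      const parent = currentComputed;
      currentComputed = this;
      this.sawUnclean = false;
      try {
        this.value = this.fn();
        this.runs += 1;
      } finally {
        currentComputed = parent;
      }
      // Clean only if every dependency ended up clean.
      this.clean = !this.sawUnclean;
    }
    // Forward contagion: an unclean node taints whoever is reading it.
    if (currentComputed && !this.clean) currentComputed.sawUnclean = true;
    return this.value;
  }
}

let tick = 0;
const clock = new NeverClean(() => tick++);
const pure = new CachedComputed(() => 21 * 2);
const tainted = new CachedComputed(() => clock.get() + 1);
const downstream = new CachedComputed(() => `${tainted.get()}:${pure.get()}`);

console.log(downstream.get()); // "1:42"
console.log(downstream.get()); // "2:42"
console.log(pure.runs, tainted.runs, downstream.runs); // 1 2 2
```

The pure branch is still cached, while everything downstream of the never-clean node re-evaluates on each read.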

divdavem commented 3 days ago

I was initially very interested in this suggestion and I have played a bit with Signal.Volatile (even writing an alternative implementation here), but I especially don't like the following behavior:

let count = 0;
const volatile = new Signal.Volatile(() => count++);
const dep1 = new Signal.Computed(() => volatile.get());
const dep2 = new Signal.Computed(() => volatile.get());
const result = new Signal.Computed(() => `${dep1.get()},${dep1.get()},${dep2.get()},${dep2.get()}`);
console.log(result.get()); // displays 0,1,2,3

When not live, there is no consistency of the value used for the volatile in the result.

I am suggesting another option: change the spec to automatically make live all signals when they are read (for the duration of the call to get): https://github.com/proposal-signals/signal-polyfill/pull/32
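To picture the consistency goal, here is a toy model in which a volatile value is read at most once per outermost get. This is only an illustration of the desired outcome, not the implementation in the linked PR. `withReadScope`, `ScopedVolatile`, and `UncachedComputed` are hypothetical names.

```javascript
// Values read from volatile sources during the current outermost get.
let scopeCache = null;
let depth = 0;

// Runs fn inside a read scope; the outermost call owns the cache.
function withReadScope(fn) {
  const outermost = depth === 0;
  if (outermost) scopeCache = new Map();
  depth += 1;
  try {
    return fn();
  } finally {
    depth -= 1;
    if (outermost) scopeCache = null;
  }
}

// Hypothetical volatile source: stable within a scope, fresh across scopes.
class ScopedVolatile {
  constructor(getter) {
    this.getter = getter;
  }
  get() {
    return withReadScope(() => {
      if (!scopeCache.has(this)) scopeCache.set(this, this.getter());
      return scopeCache.get(this);
    });
  }
}

// Hypothetical computed with no cache of its own, to keep the model small.
class UncachedComputed {
  constructor(fn) {
    this.fn = fn;
  }
  get() {
    return withReadScope(this.fn);
  }
}

let counter = 0;
const ticker = new ScopedVolatile(() => counter++);
const depA = new UncachedComputed(() => ticker.get());
const depB = new UncachedComputed(() => ticker.get());
const combined = new UncachedComputed(
  () => `${depA.get()},${depA.get()},${depB.get()},${depB.get()}`,
);

console.log(combined.get()); // "0,0,0,0"
console.log(combined.get()); // "1,1,1,1"
```

Each outermost read sees one consistent value, and the next read still pulls a fresh one.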

tc39 / proposal-signals

Volatile sources in the computation graph #237
