tc39 / proposal-signals

A proposal to add signals to JavaScript.
MIT License
2.87k stars 54 forks source link

Cross-Scheduler Effect Cycle Detection #199

Closed pjeby closed 20 hours ago

pjeby commented 2 weeks ago

tl;dr: multiple effect schedulers sharing signals means no way to detect or debug cross-scheduler effect loops... unless they can co-ordinate. Below, I propose a simple mechanism for such co-ordination, based on experience with a smaller multi-scheduler system, that can be implemented using the epoch and last-read fields the current polyfill is already tracking internally.

Effect Cycles

Unless a framework specifically disallows it, an effect can change state, resulting in the triggering of other effects. Many frameworks disallow this for any state the effect directly depends on, which is helpful, but doesn't fully resolve the problem of indirect cycles. (It's actually fairly common for effect schedulers to punt on this by just bailing out after a set number of recalculation sweeps, without even trying to identify precisely where the loop formed.)

But it's not clear to me how an effect scheduler using the proposed interop library would even do that much, as soon as more than one scheduler were involved. After all, a major part of the proposal's purpose is that you can have multiple such schedulers running independently!

Imagine, for example, one effect scheduler that runs on the next microtask after notification, and another that waits until an animation frame. Yet another runs on a set interval. How would any of these schedulers notice that their effects collectively form a loop, let alone assign responsibility for which effect(s) are involved?

To be fair, it's not even theoretically possible to solve this issue when non-effect scheduling is involved. That is, if an effect fires off an async function that later changes state, there is absolutely nothing to be done about such "slow loops". (At least, not without co-operation of the Javascript implementation!)

But it is possible to solve the problem of detecting "fast" effect cycles (i.e ones created during effect execution), even in a multi-scheduler environment, even where the effects don't run on the same clock.

I know, because I already had to solve this issue in a library I'm currently developing, which allows the user to create arbitrary effect schedulers that run effects at different sampling rates or during specific time periods (such as animation frames).

Timestamp-Based Cycle Detection

The library uses a timestamp-based tracking system very similar to that of preact-signals or the current polyfill, but with a couple of minor twists:

Global, cross-scheduler cycle detection is then a simple matter of disallowing effects from changing any state that would advance the global timestamp -- i.e., writing to a state that has already been read within the current timestamp. (In other words, only non-effect code is allowed to update state in a way that will cause up-to-date effects to re-run.)

This approach has extremely high precision: you know exactly the point at which the cycle has formed, down to the precise lines of code. Your stack trace at that moment shows both the effect function and how it got from there down to whatever code wrote to the already-read state.

This is very helpful in debugging, because usually such loops are usually entirely unintentional! Best practice for effects that write state is that such derived states should only ever be written to by one effect (or co-ordinated group thereof), with the writable form of the state not exposed to the rest of the system, and the readable form only read by other effects.

(So, if you somehow manage to make a loop while following these practices, it's because you made a mistake, and likely don't know where you made it.)

And this cycle detection approach works just fine with multiple effect schedulers, running their updates who-knows-when, so long as the system handles the write prevention, which is as simple as "if writing to a state whose lastCleanEpoch is the current epoch, and we're in a computation, throw an error". (And of course, the epoch must not be updated unless writing to a state whose lastCleanEpoch is the current epoch.)

Read-Side Cycle Detection

One minor limitation of this approach is that it only detects cycles during the effect run (i.e. at write time), even if the "real" culprit for the cycle is non-effect code reading a value between scheduler runs. (That is, if you've somehow exposed your effect-derived state to non-effect code.)

Luckily, this limitation doesn't keep the mechanism from still being useful in such cases, since you'll at least know which signal (state) has escaped containment, even if you'll still need to track down where and how.

The overall approach can in principle be extended to detect loops on the read-side as well, though I'm not proposing such at this time. This could be as simple as flagging the state as having been written by an effect, and then denying reads from non-effect code, or as complex as treating the effect as if it were a dependency, so that the effect can be re-run as part of "recalculating" the state when queried outside an effect.

But I'm not proposing any of those right now, because I haven't implemented them myself yet... at least not in JavaScript. The last time I worked with a similar set of signal algorithms was in the Trellis library for Python 16 years ago, and my memory on what I did there is a bit rusty!

(The Trellis allowed "maintenance rules" as a kind of effect that could update state in a safe way, even in the presence of apparent rule cycles, as long as they didn't have side effects outside the signaling system. They became part of the dependency graph for the state they updated, and could be run lazily when reads occurred. But the Trellis' way of doing that was very... Pythonic (in the "one right way to do it", theoretically-complete sense) and I would probably do it very differently now in JS, trading off some of the flexibility for simplicity and performance.)

Conclusions

Anyway, I just thought I should mention all this, since I believe that the slight change to timestamp handling (and checking it at write time) would allow very clean cycle detections on the "writing" side, even if it doesn't do anything about the "reading" side. (If at some point I also implement a workable read-side cycle detection approach in the library I'm working on, I'll report back if it seems potentially usable here.)

Looking forward to hearing your thoughts!

pjeby commented 18 hours ago

Closed this as it turns out my initial draft of this algorithm missed "phantom reads" -- i.e. situations where a signal's value has been depended on but not actually read because a computation depending on it didn't need to be recalculated. So it was sometimes allowing writes when it should not have, and producing missed reads instead of cycles.

In such situations the lastCleanEpoch of the upstream signal doesn't get updated, so I had to add a "phantom reads" array to my implementation to deal with that, and then when doing the "okay to write this inside an effect" check, I update the versions of nodes that were skipped during the upward sweep, or just empty the array if the timestamp changes. This allows there to be basically no overhead when effects are not writing to signals.

I still consider the cost worth it, since it allows effects to write values and easy debugging of unintentional cycles, but the correct implementation isn't O(1) per write, as I had assumed in the original proposal.