Support for incrementally computed signals, pursue history instead of dirty flags

anglinb commented 3 weeks ago

We use signia comprehensively for managing state in our website editing tool. It's great and follows the API laid out by many other signals libraries except one incredibly useful tool called "Incrementally computed signals".

Their documentation provides the best explanation but rather than using an "is_dirty" flag, you can keep a history buffer so a computed value can be updated by iterating through the changes to the signal rather than having to re-compute from scratch.

Additionally, you can control the size of buffer to keep depending on how often the computation is listened to. If the computation is listened to always, its more efficient than is_dirty b/c of the incremental nature. If it's never listened to, it only takes up to a fixed size of changes so it is memory bounded and if it is listened to somewhat frequently, it balances the tradeoffs of loosing your place by throwing away the computation every time and keeping a huge amount of history to always be able to re-compute the value from the patches. It's a dangerous but incredibly powerful tool for dealing with intensive computed values.

Pulling from their documentation

Let's say you have a list of names const names = ['Steve', 'Alex', 'Lu', 'Jamie', 'Mitja'] and we want to compute the reverse of each value. In the current proposal, the best we could do is to simply re-compute the whole list every time. Basically computed = names.map(name => name.split('').reverse().join('')) Depending on the size of that list and the computational intensity of inner map function that can be incredibly costly. Think about 10,000 items and computing the hamming distance against some fixed value. Making the entire collection dirty is incredibly wasteful if we only change a few items at a time.

Let's say we're able to capture the changes to the original signal with something more granular like immer. So now rather than just "something changed" we're getting "patch" which specifies exactly what change:

import { Patch, produceWithPatches, enablePatches } from 'immer'
import { Atom, atom } from 'signia'

enablePatches()

class ImmerAtom<T> {
    // The second Atom type parameter is the type of the diff
    readonly atom: Atom<T, Patch[]>
    constructor(name: string, initialValue: T) {
        this.atom = atom(name, initialValue, {
            // In order to store diffs, we need to provide the `historyLength` argument
            // to the atom constructor. Otherwise it will not allocate a history buffer.
            historyLength: 10,
        })
    }

    update(fn: (draft: T) => void) {
        const [nextValue, patches] = produceWithPatches(this.atom.value, fn)
        this.atom.set(nextValue, patches)
    }
}

Now we can implement our own algorithm to selectively apply the patches we generated to the output of our signal.

import { Draft } from 'immer'
import { RESET_VALUE, withDiff } from 'signia'

function map<T, U>(source: ImmerAtom<T[]>, fn: (value: T) => U): Computed<U[], Patch[]> {
    return computed(
        source.atom.name + ':mapped',
        (prev, lastComputedEpoch) => {
            // we need to check whether this is the first time we're running
            if (isUninitialized(prev)) {
                // if so, just map over the array and return it
                return source.atom.value.map(fn)
            }

            // this is not the first time we're running, so we need to calculate the diff of the source atom
            const diffs = source.atom.getDiffSince(lastComputedEpoch)
            // if there is not enough history to calculate the diff, this will be the RESET_VALUE constant
            if (diffs === RESET_VALUE) {
                // in which case we need to start over
                return source.atom.value.map(fn)
            }

            // we have diffs and a previous value
            const [next, patches] = produceWithPatches(prev, (draft) => {
                // apply the upstream diffs while generating a new set of downstream diffs
                for (const patch of diffs.flat()) {
                    const index = patch.path[0]
                    if (typeof index !== 'number') {
                        // this will be array length changes
                        draft[patch.path[0] as 'length'] = patch.value as number
                        continue
                    }
                    if (patch.op === 'add') {
                        if (patch.path.length === 1) {
                            // this is a new item in the array, we need to splice it in and call the map function on it
                            draft.splice(patch.path[0] as number, 0, fn(patch.value) as Draft<U>)
                        } else {
                            // one of the existing items in the array has changed deeply
                            // let's call the map function on the new value
                            draft[index] = fn(source.atom.value[index]) as Draft<U>
                        }
                    } else if (patch.op === 'replace') {
                        // one of the existing items in the array has been fully replaced
                        draft[index] = fn(patch.value) as Draft<U>
                    } else if (patch.op === 'remove') {
                        next.splice(index, 1)
                    }
                }
            })

            // withDiff is a helper function that returns a special value that tells Signia to use the
            // provided value and diff
            return withDiff(next, patches)
        },
        {
            historyLength: 10,
        }
    )
}

Now we can see the results of making a single update:

const names = new ImmerAtom('names', ['Steve', 'Alex', 'Lu', 'Jamie', 'Mitja'])

let numReverseCalls = 0
const reversedNames = map(names, (name) => {
    numReverseCalls++
    return name.split('').reverse().join('')
})

console.log(reversedNames.value) // [ 'evetS', 'xelA', 'uL', 'eimaJ', 'ajtiM' ]
console.log(numReverseCalls) // 5

// Then later we update the array to include "David" at the end 

names.update((draft) => {
    draft.push('David')
})

console.log(reversedNames.value) // [ 'evetS', 'xelA', 'uL', 'eimaJ', 'ajtiM', 'divaD' ]
console.log(numReverseCalls) // 6

// Then we update
names.update((draft) => {
    draft[0] = 'Sunil'
})

console.log(reversedNames.value) // [ 'linuS', 'xelA', 'uL', 'eimaJ', 'ajtiM', 'divaD' ]
console.log(numReverseCalls) // 7

// Then we remove
names.update((draft) => {
    draft.pop()
})

console.log(reversedNames.value) // [ 'linuS', 'xelA', 'uL', 'eimaJ', 'ajtiM' ]
console.log(numReverseCalls) // 7

In short, the time complexity of the computed value can be bounded to the time complexity of the updating the original signal. With the proposal as it is today, the time complexity of the computed value is bounded by whatever a full re-computation takes.

The current proposal is almost exactly the same as what signia provides with no history length. Basically every change returns a RESET_VALUE which triggers a full re-computation.

dead-claudia commented 2 weeks ago

Worth noting history has a significant runtime memory cost, and unbounded history is a recipe for disaster for many use cases.

NullVoxPopuli commented 2 weeks ago

I confirm that managing history very quickly explodes memory. I tried it with a 1 dimensional reactive list of boolean values, and it very quickly starts to make the browser slow when (XY) cells 60 (attempted) fps -- was trying to visualize history in https://game-of-life.nullvoxpopuli.com/ :sweat_smile: -- I eventually decided to cap history at 20 steps backwards -- which is still very slow, but not as slow (infinitely increasingly more and more slow and unboundary history can be) - this is a tradeoff -- As far as the proposal is concerned, I think it possible to implement in userland though. Could do some import.meta.env.DEV guarding or something like that.

tc39 / proposal-signals

Support for incrementally computed signals, pursue history instead of dirty flags #190