Implementing debounce with CAF?

getify / CAF

Cancelable Async Flows (CAF)

MIT License

1.34k stars 46 forks source link

Implementing debounce with CAF? #14

Closed keystroke closed 4 years ago

keystroke commented 4 years ago

[EDIT - this comment is in the context of the original title: "CAF.delay does not proxy cancel reason"]

Consider this example:

const token = new CAF.cancelToken();
CAF.delay(token.signal, 5000).then(console.log, console.error);
token.abort('TEST');

I would expect that console would log the error 'TEST' but instead it logs the error 'delay (5000) interrupted'.

Am I misunderstanding this behavior?

getify commented 4 years ago

It's true that token.abort(..) with a reason passed does typically reject the attached signal-promise with that reason value. But I don't think I contemplated people wanting the abort-reason to flow through the timeout cancellation as in this particular case. So as currently implemented, it does not... see here how onAbort(..) ignores any reason passed to it:

https://github.com/getify/CAF/blob/master/src/caf.src.js#L106-L110

It seems like a reasonable improvement to accommodate the reason flowing through (though technically it would be a breaking change).

I guess onAbort(..) would need a reason parameter added, with a default param value of the `delay (${ms}) interrupted` string value. So, if the abort reason is manually passed, that default is overridden. Would need to double-check if that simple solution works, but I think it should.

Willing to submit a PR?

keystroke commented 4 years ago

Thanks for the quick reply, and for the background info. I'm not sure if I actually need the behavior anymore. Maybe you could provide a bit of more general advice...

What I am actually trying to do with your library is create a "debounced" CAF to do a whole bunch of async work on the back of an input field validation. Below is what I have so far for a debouncedCAF version:

function debouncedCAF(generator, timeout, reason = 'Cancelled') {
    var token;
    return function (...args) {
        if (token) token.abort(reason);
        token = new CAF.cancelToken();
        return CAF.delay(token.signal, timeout).then(
            () => CAF(generator)
                .apply(this, [token.signal, ...args])
                .catch(e => { if (e !== reason) throw e; }),
            interruptedReason => console.debug('debouncedCAF', interruptedReason));
    }
}

It took several hours and iterations to arrive at above code lol. This stuff is hard!

I think above logic is correct, and for my use-case, I don't actually need to do anything in the "delay interruption callback" (where I am console.debug logging for now).

Have you thought about implementing a debouncedCAF? Is that abstraction / pattern still the right way to go about this in 2020? It took me a while to find the approach using generator functions and then to find your library. I spent a lot of time trying various different approaches with promises and signals and such... but using this CAF approach looks to be the best abstraction I've seen yet.

I'm using above function in combination with Vue.js for now, and this is what it is looking like:

const app = new Vue({
    el: '#app',
    data: {
        inputs: { a: '', b: '' }
    },
    watch: {
        'inputs.a': debouncedCAF(function* (signal, newVal, oldVal) {
            //signal.pr.catch(reason => { console.log('aborted'); });
            try {
                const response = yield fetch('https://contoso.com/' + this.inputs.a, {
                    signal: CAF.signalRace([signal, CAF.timeout(5000, 'Request timed-out.').signal])
                });
            } catch (error) {
                console.error('inputs.a', error);
            }
        }, 1000, 'Cancelled. inputs.a modified.')
    }
});

function debouncedCAF(generator, timeout, reason = 'Cancelled') {
    var token;
    return function (...args) {
        if (token) token.abort(reason);
        token = new CAF.cancelToken();
        return CAF.delay(token.signal, timeout).then(
            () => CAF(generator)
                .apply(this, [token.signal, ...args])
                .catch(e => { if (e !== reason) throw e; }),
            interruptedReason => console.debug('debouncedCAF', interruptedReason));
    }
}

So far, I only have one async operation to do in my callback (a fetch for now). Am I heading in the right direction do you think?

The next step for me is to create a helper function (fetchWithTimeout) so I can better "merge" signal cancellations with fetch calls. I need to be able to pass a custom timeout to the fetch calls, so I need fetch internally to abort when either the timeout has elapsed, or if the entire operation is cancelled. I still want the timeout error to propagate to my catch block, but not the cancel error. It looks like above is working.

Is above code looking like what you would expect for a valid use of your library? Am I overlooking anything to make this more elegant?

Thanks, appreciate your time and work on this library!

keystroke commented 4 years ago

I'm particularly uncertain about the catch block in the debouncedCAF function:

catch(e => { if (e !== reason) throw e; })

Is that the correct way to detect if the error was due to being cancelled? I could omit this, and just let the cancellation error be thrown back into the CAF, but then the catch block of the CAF would need to have similar code to detect if the error was "real" (e.g. bad input) vs. the operation being cancelled. Using a string seems weak, since some async operation I call in the CAF could throw its own error with the same string. That makes me think using a new object for the cancellation reason would be best since the === operator would ensure the cancellation was due to my debounce reset.

There is the option of setting a callback on the signal.pr.catch in the CAF to do custom work on cancellation, so I felt fine not otherwise-exposing the CAF to the debounce cancellation by re-throwing the error. But I'd still prefer a more reliably way to cancel than relying on a string comparison to detect the cancellation...

It also occurred to me that maybe the debounceCAF should have input to decide whether to cancel currently-running operations, versus just throttling them. This debouncedCAF is more like a singltonDebouncedCAF... haha.

WHAT IS THE ABSTRACTION I AM AFTER?! lol appreciate any thoughts or comments on this topic. I'll update the title.

keystroke commented 4 years ago

Sorry for spamming so many comments. Just feel alone in this journey.

Actually it looks like there are more issues with my code. The catch block I implicated above actually just ensures the cancellation error is not unhandled, and just ignores it. But if the generator function passed into the CAF were to throw an error, this code will allow that to pass-through as an unhandled exception (unless that error is a string that matches the cancellation reason).

That made me look at this function:

async function test(timeout = 1000) {
    try {
        const token = new CAF.cancelToken();
        const response = await fetch('https://contoso.com', {
            signal: CAF.signalRace([token.signal, CAF.timeout(timeout, 'Timeout').signal])
        });
        token.abort('abort');
    } catch (error) {
        console.log(error);
    }
}

The catch block error is just the generic fetch abort error in above scenario. It seems like in the catch block, you have to check the wrapped signals to see which one is actually aborted.

So correction would be like this:

async function test2(timeout = 1000) {
    try {
        const abortToken = new CAF.cancelToken();
        const timeoutToken = CAF.timeout(timeout, 'Timeout');
        const response = await fetch('https://contoso.com', {
            signal: CAF.signalRace([abortToken.signal, timeoutToken.signal])
        });
        token.abort('abort');
    } catch (error) {
        if (abortToken.signal.aborted) {
            console.log('Aborted!');
        } else if (timeoutToken.signal.aborted) {
            console.log('Timeout!');
        } else {
            console.error(error);
        }
    }
}

I will keep playing around with the code snippets and see where I get. I'll avoid commenting back with more noise unless I find a final solution or hit a more focused error or something.

getify commented 4 years ago

Seems like a pretty reasonable approach to using CAF for this debouncing purpose... I only spotted a bit of inefficiency in re-creating the arrow function each time a new timer token is created, as well as re-wrapping the generator in the CAF(..) call each time. Here's a tweaked version that only does those tasks once and uses the cached results in subsequent calls:

function debouncedCAF(generator, timeout, reason = 'Cancelled') {
    var fn = CAF(generator);
    var runFn;
    var token;

    return function (...args) {
        if (token) token.abort(reason);
        if (!runFn) {
            runFn = () => fn
                .apply(this, [token.signal, ...args])
                .catch(e => { if (e !== reason) throw e; });
        }
        token = new CAF.cancelToken();
        return CAF.delay(token.signal, timeout).then(
            runFn,
            interruptedReason => console.debug('debouncedCAF', interruptedReason));
    };
}

I'm particularly uncertain about the catch block in the debouncedCAF function:

catch(e => { if (e !== reason) throw e; })

Is that the correct way to detect if the error was due to being cancelled?

It's fine as-is, but if you wanted to make it slightly more robust, you might create a specific Error sub-class, and check that the exception received is of that type or not:

class DebounceCancelled extends Error {}

function debouncedCAF(generator, timeout, reason = new DebounceCancelled()) {
    // ..
    .catch(e => { if (!(e instanceof DebounceCancelled)) throw e; });
    // ..

keystroke commented 4 years ago

Thanks for the updated version! Especially consideration on the performance. I hadn't gotten that far but it makes perfect sense.

Regarding the exception detection, shouldn't I keep the instance equality check?

class DebounceCancelled extends Error {}

function debouncedCAF(generator, timeout, reason = new DebounceCancelled()) {
    // ..
    .catch(e => { if (e !== reason) throw e; });
    // ..

My concern is around nested cancelled functions, or code that would "cheese" it like this:

const myFunc = debouncedCAF(function* (signal, input) {
    throw new DebounceCancelled();
}, 1000);

If above example, the debouncedCAF would re-throw that exception because it was a different instance, even though it was the same type. If I used strings or type checks, I wouldn't be able to tell, and it would just ignore the error. But I'm not 100% certain that it matters. Definitely seems like a pathological scenario.

Although, what happens if I start nesting debouncedCAF function calls?

const myFunc1 = debouncedCAF(function* (signal, input) {
    // ..
}, 1000);

const myFunc2 = debouncedCAF(function* (signal, input) {
    return yield myFunc1(signal);
}, 2000);

myFunc2();
myFunc1();
myFunc2();

I can't quite wrap my head around the nested example at the moment. In above example, when I call jsmyFunc1(); it looks like the cancellations will trigger all the way through. But then when I call jsmyFunc2(); for the second time, would it cancel itself because the previous call to myFunc1 gets cancelled?

Seriously, thank you so much for your help so far and for making this library. You have no idea how long it's been to get to this point. I've just been casually messing-around with JS for last ~6 months, and I haven't been able to get passed ~1000 LOC without the project just completely falling apart in chaos and digging worms out of my eyeballs. It's surprising how conceptually dense <100 lines of code can be in these kinds of helper functions. And coming from a c# background where async and cancellation tokens are the standard and ready to go out of-the-box, it's been painful trying to wire this up. I am worried that I am just programming in the wrong paradigm completely. But seeing the generator pattern thing gives me hope!

getify commented 4 years ago

shouldn't I keep the instance equality check?

Depends on if you want to let people pass in their own cancellation reasons (maybe with their own message in it) which are a legit instance of that class, or if you're going to keep that class hidden so only your tool can create them.

If the former, where it's publicly exposed, the instanceof check is necessary. If you're keeping it private, you could actually just create a single hidden shared value, like an empty object, that served as an internal "brand check" that you could verify only you would be able to provide (verified with the === check as you suggest).

throw new DebounceCancelled();

If someone went out of their way to throw one of these, I would say you would probably want to treat it as a cancellation that they manually triggered (instead of calling token.abort()).

getify commented 4 years ago

return yield myFunc1(signal);

I can't quite wrap my head around the nested example at the moment.

It's a bit tricky, but let me see if I can help you walk through it.

Look at the current debounceCAF(..) utility implementation:

return function (...args) {
    // ..
    .apply(this, [token.signal, ...args])
    // ..
}

That function is the myFunc1(..) being invoked in this line:

return yield myFunc1(signal);

You're passing signal in as the argument, but that signal is not what this signal parameter will receive:

const myFunc1 = debouncedCAF(function* (signal, input) {

The reason is, .apply(this, [token.signal, ...args]) is forcibly already setting that function's debounce-cancellation token as the first argument. So the nested signal you're trying to pass along will actually come in as the second parameter, which will be incorrect, since your function is expecting input for that parameter, not a nested cancellation signal.

In other words, as your current implementation is conceived, the "nested" calls to separately debounced calls are NOT chained (where one cancels the other). That actually might be the better desired behavior, rather than having one magically resetting the other. You'd need to construct some concrete use-cases of nested debounce calls to figure out if it's desired or not.

But to explicitly answer your question, no, the debounce cancellation of one would not propagate automatically to the other nested call.

OTOH, by nature of the fact that each call to a debounced function does cancel its previous pending invocation, just the very call to myFunc1(..), even without trying to pass along the nested signal, is in fact propagating reset/cancellation to that nested call, of sorts. :)

keystroke commented 4 years ago

@getify going back to previous example, I played around a bit and realized that we do need to re-create the run function in the debouncedCAF definition each time we are called... or find a way to re-bind the input value.

Below example logs 1 to console instead of 2:

const myFunc = debouncedCAF(function* (signal, input) {
    console.log(input);
}, 1000);

myFunc(1);
myFunc(2);

Regarding the example of nested debounce calls... trying it out, I do see that the signal is not passed to child debouncedCAFs, and it instead gets passed as the first input. Makes sense working through it.

I also made below example, and verified that when I call myFunc1 which myFunc2 is already waiting-on,

const myFunc1 = debouncedCAF(function* (signal, input) {
    console.log('func1 Called', input);
}, 2000);

const myFunc2 = debouncedCAF(function* (signal, input) {
    console.log('func2 Start', input);
    yield myFunc1('Called by myFunc2 ' + input);
    console.log('func2 Done', input);
}, 1000);

function runTest() {
    myFunc2(1);
    setTimeout(() => myFunc1(2), 1100);
}

Above example traces this to the console:

func2 Start 1
debouncedCAF delay (2000) interrupted
func2 Done 1
func1 Called 2

So in this case, myFunc2 is "waiting" on myFunc1, when myFunc1 is cancelled, what should myFunc2 do? In this case, it continued executing.

I don't even know conceptually what makes sense in this scenario. More realistically, I will only have a single top-level debouncedCAF, and the idea would be that internally it might be calling other non-debounced CAF functions and such.

In below example, just like above, I observe that myFunc2 continues execution after myFunc1 gets cancelled:

function debouncedCAF(generator, timeout, reason = 'Cancelled') {
    var token;
    return function (...args) {
        if (token) token.abort(reason);
        token = new CAF.cancelToken();
        return CAF.delay(token.signal, timeout).then(
            () => CAF(generator)
                .apply(this, [token.signal, ...args])
                .catch(e => { console.debug(e); if (e !== reason) throw e; }),
            interruptedReason => console.debug('debouncedCAF', interruptedReason));
    }
}

const myFunc1 = debouncedCAF(function* (signal, input) {
    console.log('func1 Called', input);
    throw 'Cancelled';
}, 2000);

const myFunc2 = debouncedCAF(function* (signal, input) {
    console.log('func2 Start', input);
    yield myFunc1('Called by myFunc2 ' + input);
    console.log('func2 Done', input);
}, 1000);

myFunc2(1);

However, when I change the declaration to assign unique objects to the "reason" parameter, I see that it does not continue executing myFunc2 when myfunc1 is cancelled... because it throws the cancellation as {} === {} returns false:

const myFunc1 = debouncedCAF(function* (signal, input) {
    console.log('func1 Called', input);
    throw 'Cancelled';
}, 2000, {});

const myFunc2 = debouncedCAF(function* (signal, input) {
    console.log('func2 Start', input);
    yield myFunc1('Called by myFunc2 ' + input);
    console.log('func2 Done', input); // this does not execute
}, 1000, {});

myFunc2(1);

But, above results in an unhandled exception propagating to the caller of myFunc.

So... if we want the behavior that nested debounce functions chain their cancellations, I tried messing-around a bit, but I can't seem to work-out a way to make this work. Only approach I can see would be to just re-throw the cancellation error, and then force callers to decide what to do in a catch block...

getify commented 4 years ago

The more I think about it, the less the "nested (or chained) debouncing" call makes any sense. I haven't been able to come up with any reasonable use-cases for that yet.

If you step back and think about it, if one debounced function A is calling another debounced function B, the B function isn't being invoked (to start its timer) unless A has had a chance to fully complete its debouncing-delay-timer and actually been executed. So they're running in sequence, not in series/parallel.

Two scenarios to consider:

A has a debounce interval of 300ms and B has a debounce interval of 200ms. In this scenario, B's debounce interval being shorter than A's means that B's debouncing is moot. It delays, but it could just as easily be a setTimeout, because B would never be invoked more often than its interval since A is running less frequently than that.
A has a debounce interval of 200ms and B has a debounce interval of 300ms. In this scenario, B is rarely ever going to execute. If A is happening repeatedly, it runs early enough to reset B perpetually. This has the effect of indefinitely deferring B, until A stops happening long enough, and B will happen once at the end.

IMO, neither of these scenarios is a particularly useful application of debouncing (for B, anyway). Scenario 1 isn't a debouncing B at all, it's just a sequential timer deferral. And Scenario 2 is, strictly speaking, a debouncing of B, but with an unbounded window of deferral, which would only rarely be useful as far as I can tell.

getify commented 4 years ago

I think we'll close this for now. If there's anything left for CAF to address, feel free to comment again.