thefrontside / effection

Structured concurrency and effects for JavaScript
https://frontside.com/effection
MIT License
592 stars 25 forks source link

"finally" cleanup code not guaranteed to run in the correct order #899

Closed adrusi closed 6 months ago

adrusi commented 6 months ago

Overview

If a parent task exits while a child task is still running, the effection runtime will halt the child task, in order to provide structured concurrency guarantees.

However, when a generator function returns, the javascript runtime will run its finally cleanup code before execution is returned to the scheduler. This means that cleanup code in the parent has already been run before the effection runtime has a chance to halt the child task.

That's a surprising and subtle departure from programmer expectations, and a violation of structured concurrency guarantees, since code from the child task is still running after the parent task has completely exited.

This gotcha should be documented, and possibly the effection API should be designed to make impossible.

Example

Code shared across all examples:

import * as e from "effection"

class Resource {
    #isClosed = false

    close() {
        console.log("closing resource")
        this.#isClosed = true
    }

    performOperation(note: string) {
        if (this.#isClosed) throw new Error("Resource is closed")
        console.log(`performing operation: ${note}`)
    }
}

function* child(res: Resource): e.Operation<void> {
    try {
        console.log("child pre-sleep")
        yield* e.sleep(1000)
        console.log("child post-sleep")
    } finally {
        console.log("child finally")
        res.performOperation("child")
    }
}

If the programmer relies on the auto-halting of child tasks by the effection runtime, and uses try...finally for cleanup code, then you can get use-after-close errors if a child task's cleanup code uses a resource owned by its parent.

function* bad() {
    let res = new Resource()
    try {
        res.performOperation("bad")
        const barTask = yield* e.spawn(() => child(res))
        yield* e.sleep(500)
    } finally {
        console.log("bad finally")
        res.close()
        console.log("bad is over")
    }
}

await e.main(function* () {
    try {
        yield* e.call(bad)
    } catch (e) {
        console.error(e)
    }
})
performing operation: bad
child pre-sleep
bad finally
closing resource
bad is over
child finally
 7 |         console.log("closing resource")
 8 |         this.#isClosed = true
 9 |     }
10 | 
11 |     performOperation(note: string) {
12 |         if (this.#isClosed) throw new Error("Resource is closed")
                                       ^
error: Resource is closed

With the current effection API, there are two ways to avoid the problem.

The first is to explicitly halt child tasks before the parent returns:

function* good1(): e.Operation<void> {
    let res = new Resource()
    try {
        res.performOperation("good1")
        const barTask = yield* e.spawn(() => child(res))
        try {
            yield* e.sleep(500)
        } finally {
            yield* barTask.halt()
        }
    } finally {
        console.log("good1 finally")
        res.close()
    }
}

await e.main(function* () {
    yield* e.call(good1)
})
performing operation: good1
child pre-sleep
child finally
performing operation: child
good1 finally
closing resource

And the other is to use ensure for cleaning up resources instead of try...finally:

function* good2(): e.Operation<void> {
    let res = new Resource()
    yield* e.ensure(() => {
        console.log("good2 ensure")
        res.close()
    })

    res.performOperation("good2")
    const barTask = yield* e.spawn(() => child(res))
    yield* e.sleep(500)
}

await e.main(function* () {
    yield* e.call(good2)
})
performing operation: good2
child pre-sleep
child finally
performing operation: child
good2 ensure
closing resource

Avoiding the problem

This problem boils down to the non-equivalence of lexical scope and task-execution scope in effection's design. You avoid it by either making task-execution scope match lexical scope by explicitly halting (or joining) child tasks, or making cleanup code tied to task-execution scope instead of lexical scope, by using effection runtime functions.

It's unfortunate that moving cleanup code to task-execution scope is the more ergonomic option in effection, since task-execution scope is invisible and harder to reason about. When you use ensure, the point at which your cleanup code runs depends on whether the operation was invoked using yield* myOperation() or yield* call(myOperation).

Ideally, there would be no such concept as task-execution scope, with the API designed so as to make it always match some lexical scope. Trio, a structured concurrency runtime for python, achieves this through strict limitations on when child tasks are joined. It'd be feasible to copy its design by replacing spawn with:

yield* withNursery(function* (nursery) {
    nursery.startSoon(() => myOperation1())
    nursery.startSoon(() => myOperation2())
})

Another option would be to ship a linter that ensures child tasks are all explicitly halted or joined in the same scope where they're created.

cowboyd commented 6 months ago

@adrusi You've definitely come across a subtle difference, but Effection does enforce a strict guaranteed order of shutdown, albeit not the one you might have been expecting. However, there is a rationale for why a task's body completes execution before the resources it allocates are torn down.

In essence, Effection guarantees not only that a spawned task will be halted after its parent exits, but it also (and this is critical for this point) guarantees that the spawned task will be alive throughout the execution of its parent's body. This is necessary in case the caller is relying on the spawned task to provide side-effects up-to and including its final statement, and doubly so in the case of a resource which could be used up-to and including the very last statement in its very last finally block.

The very moment that a task has completed its final statement is the moment when we can be confident that all of its spawned tasks and resources are no longer be needed, and can be halted immediately.

Effection provides a resource() API for precisely the use case to which you're referring. We could write your resource like this:

export function useResource(): Operation<Resource> {
  return resource(function*(provide) {
    let resource = new Resource()
    try {
      yield* provide(resource);
    } finally {
      resource.close();
    }
  });
};

await main(function* () {
  let resource = yield* useResource();
  yield* spawn(() => child(resource));
    yield* sleep(500);    
});

So Effection guarantees that resource will be alive throughout its body (but not a moment later). Furthermore, it guarantees that resources and tasks are destroyed in the exact reverse order in which they were created so as to ensure that no dead references are held. So in the example above, the spawned child() will be halted first, and then resource will be halted.

cowboyd commented 6 months ago

I'm going to go ahead and close this, but feel free to follow up with any questions or comments either here or in our Discord