camunda-community-hub / zeebe-client-node-js

Node.js client library for Zeebe Microservices Orchestration Engine
https://camunda-community-hub.github.io/zeebe-client-node-js/
Apache License 2.0
152 stars 38 forks source link

Exhaustive check on code paths in the job handler #210

Closed jwulf closed 3 years ago

jwulf commented 3 years ago

This is a proposed exhaustiveness check on the worker callback handler to ensure that all code paths take responsibility for the activated job.

At the moment, your IDE cannot warn you if you have code paths in the worker that result in the worker doing nothing until the broker times out the job activation. This breaking change forces the worker’s job handler to return the result of a call to the broker to complete, fail, or error the job, or signal that responsibility for this has been forwarded to an external system.

The IDE can then warn if there are code paths in the worker that don’t do one of these. It is a breaking change for existing code, as it changes the signature of the worker handler callback.

Feedback from existing users is welcome!

Motivation

At the moment you can write worker job callback handlers with a specific bug in them - code paths that do not call a completion method. The TypeScript language server cannot detect these errors, and thus workers with unintended behaviour can be written and pushed to production.

For example:

zbc.createWorker<any>({
    taskType: 'process-favorite-albums',
    taskHandler: (job, complete, worker) => {
        const { name, albums = [] } = job.variables
        worker.log(`${name} has the following albums: ${albums?.join?.(', ')}`)
        if (!name || !albums) {
            worker.log(`Missing required parameters!`)
        } else {
            // Do something with the payload
            complete.success()
        }
    },
    fetchVariable: ['name', 'albums'],
})

In the above example, we probably meant to call complete.error() when there are missing required payload variable values.

Another possible bug (I discovered this while refactoring tests to match this proposal):

zbc2.createWorker({
    loglevel: 'NONE',
    taskHandler: (_, complete) => complete.success,
    taskType: 'nonsense-task',
})

At the moment, the IDE cannot tell that I forgot to invoke the method.

For every job that is received, we expect that the worker handler will call one of the completion methods - whether to complete, fail, error, or forward the job.

Having the TypeScript language server detect code paths in the worker that do not complete the job would be a useful feature that avoids bugs in code.

Proposed Solution

We can detect if no completion handler has been called in the handler by creating a JOB_ACTION_ACKNOWLEDGEMENT symbol, like this:

export const JOB_ACTION_ACKNOWLEDGEMENT = 'JOB_ACTION_ACKNOWLEDGEMENT' as const
type JOB_ACTION_ACKNOWLEDGEMENT = typeof JOB_ACTION_ACKNOWLEDGEMENT
export type MustReturnJobActionAcknowledgement =
    | JOB_ACTION_ACKNOWLEDGEMENT
    | Promise<JOB_ACTION_ACKNOWLEDGEMENT>

These names help to provide a helpful type system error, like this:

Screen Shot 2021-03-05 at 7 25 28 PM

Then the callback handler signature goes from this:

export type ZBBatchWorkerTaskHandler<V, H, O> = (
    jobs: Array<BatchedJob<V, H, O>>,
    worker: ZBBatchWorker<V, H, O>
) => void

To this:

export type ZBBatchWorkerTaskHandler<V, H, O> = (
    jobs: Array<BatchedJob<V, H, O>>,
    worker: ZBBatchWorker<V, H, O>
) => MustReturnJobActionAcknowledgement[]

And we make the job action methods return the JOB_ACTION_ACKNOWLEDGEMENT symbol.

Now, in worker handler code you have to do this, for a standard worker:

zbc.createWorker<any>({
    taskType: 'process-favorite-albums',
    taskHandler: (job, complete, worker) => {
        const { name, albums = [] } = job.variables
        worker.log(`${name} has the following albums: ${albums?.join?.(', ')}`)
        if (!name || !albums) {
            worker.log(`Missing required parameters!`)
            return complete.error()
        } else {
            // Do something with the payload
            return complete.success()
        }
    },
    fetchVariable: ['name', 'albums'],
})

Challenges

zbc.createWorker<any>({
    taskType: 'process-favorite-albums',
    taskHandler: job => {
        const { name, albums = [] } = job.variables
        console.log(`${name} has the following albums: ${albums?.join?.(', ')}`)
        if (!name || !albums) {
            const token = job.error(`Missing required parameters in payload`)
            console.log('Do you really do something in a worker after a completion call?')
            return token
        }
        return job.complete()
    },
    fetchVariable: ['name', 'albums'],
})

The example is contrived, and maybe doing something in a worker after the completion is an anti-pattern / edge case.

A practical case where this must be done is in asynchronous Jest tests, where the done() callback must be called in the worker handler.

For example:

taskHandler: async jobs => {
      expect(jobs.length).toBe(10)
      await Promise.all(jobs.map(job => job.complete()))
      done()
},

needs to be rewritten as:

taskHandler: async jobs => {
    expect(jobs.length).toBe(10)
    const res = await Promise.all(jobs.map(job => job.complete()))
    done()
    return res
},
jwulf commented 3 years ago

Here is another example of a bug that this feature would catch, that I just found in the wild:

 if (listId === undefined) {
      complete.failure(`No list configuration found for ${listName}`);
    }
    try {
      await this.emailService.addToList(email, listId);
      complete.success();
    } catch (e) {
      this.logError(e);
      complete.failure(e.message);
    }

Here, we meant to return after the call to complete.failure(), rather than do the addToList side-effect.

With the proposed type check, the IDE would (be more likely to) catch this. It wouldn't catch it if I returned the complete.success but forgot to return the preceding complete.failure. See #213 for an approach to that.

jwulf commented 3 years ago

Changes merged for 1.3.0.

berndruecker commented 3 years ago

Hah - immediately stumbled over this when migrating my examples :-) But was good to google - so solved quickly :-)