camunda-community-hub / zeebe-client-node-js

Node.js client library for Zeebe Microservices Orchestration Engine
https://camunda-community-hub.github.io/zeebe-client-node-js/
Apache License 2.0
152 stars 38 forks source link

Worker thinks that it is at max capacity when jobBatchMinSize == maxJobsToActivate #216

Closed iamyume closed 2 years ago

iamyume commented 3 years ago

When jobBatchMinSize is the same number as maxJobsToActivate, I get the following messages in the logs

11:20:11.050 | zeebe |  [test (batch)] INFO: Worker at max capacity - test has 0, a capacity of 5, and a minimum job batch size of 5.
11:20:11.351 | zeebe |  [test (batch)] INFO: Worker at max capacity - test has 0, a capacity of 5, and a minimum job batch size of 5.

I believe this is due to the isOverCapacity check found in https://github.com/camunda-community-hub/zeebe-client-node-js/blob/953a93894406d6e55cdfcc1e6289599644f77831/src/lib/ZBWorkerBase.ts#L378

Expected Behavior

Worker should work on jobs and not be over capacity.

Current Behavior

Worker thinks that it is over capacity.

Possible Solution

Will probably need to have an additional check to see if those 2 values are the same.

Steps to Reproduce

  1. set jobBatchMinSize and maxJobsToActivate to be the same value > 0
  2. run a worker and view log messages saying that it is at max capacity

Context (Environment)

I can't set the number to be the same for both values. I have to make sure that jobBatchMinSize is less than maxJobsToActivate

Detailed Description

Possible Implementation

jwulf commented 3 years ago

This is actually technically correct - it is at maxCapacity in this scenario. I added this warning to give you a way to visually see when the worker was saturating, to help with performance tuning.

However, the semantic is slightly different for a Batch Worker, which is a concept that was introduced later.

I'm not sure about the best way to address this - maybe leave it as it is and make a note of it in documentation?

Do you have an idea on how you think it should work?

iamyume commented 3 years ago

is it really at max capacity if it is not actively working on any jobs? If the maxJobsToActivate == 5 & jobBatchMinSize == 5 the worker can be working on 0 jobs and will never be able to pick up any jobs since it will think that it is at max capacity.

If this is suppose to be technically correct, then adding documentation saying that maxJobsToActivate should be set to greater than jobBatchMinSize would suffice.

younes-io commented 3 years ago

is someone treating this issue or can I take it?

jwulf commented 2 years ago

is someone treating this issue or can I take it?

Go for it!

younes-io commented 2 years ago

is someone treating this issue or can I take it?

Go for it!

Will do, but before:

According to the docs,

I think the isOverCapacity formula should look like this:

const isOverCapacity = this.activeJobs >= this.maxJobsToActivate + this.jobBatchMinSize

That would mean that when the worker's activeJobs exceed the maximum number of "concurrent tasks AND jobs to fetch".

Do you agree with this definition of the "OverCapacity"?

I have created a worker, but I couldn't reproduce the logs message (even with a DEBUG level, nothing appears) with the worker max capacity. You can take a look at this and let me know what you think:

image

jwulf commented 2 years ago

This relates to the BatchWorker, so you need to createBatchWorker.

I checked the current master branch, and the code has changed significantly since this bug was opened. I don't know if it even has this problem now.

The way to test it would be to start heaps of processes (or one process with heaps of parallel tasks) then do*:

const w = zb.createBatchWorker({
    jobBatchMaxTime: Duration.seconds.from(120),
    jobBatchMinSize: 3,
    maxJobsToActivate: 8, 
    loglevel: 'INFO',
    taskHandler: async jobs => {
        const res = await Promise.all(jobs.map(job => job.complete()))
        done()
        return res
    },
    taskType: taskTypes['console-log'],
});

And see if it still complains with the new code.

Is building a reproducer for that something that you could do? If you do, please put it into a repo and link it here, and we can take a look at it.

*This code is based on this test: https://github.com/camunda-community-hub/zeebe-client-node-js/blob/801f69beadd66b2717f36869113e2802fe1c5838/src/__tests__/integration/API1/Worker-BatchWorker.spec.ts#L34-L45.

younes-io commented 2 years ago

According to the code the OP points to, the issue is related to Zeebe 0.26 Do you still maintain the 0.26 version ? As you said, the code changed a lot, and in the 1.0 version, the isOverCapacity does not exist anymore... Let me know how to get a Docker compose for 0.26 if you want me to test the 0.26 version

jwulf commented 2 years ago

I don't make changes to the 0.26 client anymore, so this will have to stay as it is. Thanks for taking a look at it, and I apologise for not picking the version up at the outset!