mojolicious / minion.js

:racehorse: Node.js high performance job queue
https://www.npmjs.com/package/@minionjs/core
MIT License
18 stars 2 forks source link

Bug: Minion._result can cause process killing exception #15

Closed DrMurx closed 3 weeks ago

DrMurx commented 1 month ago

Minion._result calls PgBackend.listJobs which may fail with an exception, but since _result might be called from a setTimeout, this exception may eventually kill the process.

The obvious solution would be to try-catch any exception in _result and then call the reject function passed into the method.

WDYT?

kraih commented 1 month ago

Probably would make sense. 🤔

DrMurx commented 1 month ago

@kraih Now that I'm thinking about it a bit more, it may actually be better to move the while result retrieval to the backend instead of polling it from Minion class. The backend implementation should have better knowledge about how to retrieve the result, and it may pick a more efficient method than polling - PgBackend could use the notification mechanism already used in the enqueue/dequeue logic.

kraih commented 1 month ago

It's a direct port of the Perl version, i think the original goal was not to make any changes in the backend for the feature if i remember correctly.

DrMurx commented 1 month ago

Time to emancipate 😏

kraih commented 1 month ago

If i remember correctly i actually tried a variant with notify and it did not perform well in this case. Not sure about specifics though, it's been some time.

DrMurx commented 1 month ago

Understood, and I don't aim for ditching the polling logic right now.

My train of thought was that pushing _result to PgBackend would help solving the initial problem I've reported, as it allows PgBackend to better deal with its own potential connectivity issues.

The efficiency gain is more a nice side effect - it would give the backend the option to wait for results more efficiently, if that's possible and desired. Whether an efficiency gain comes from any kind of notification mechanism or just from a simpler query to poll the jobs table is up to the backend.

kraih commented 1 month ago

Yea, i think if someone found an efficient solution without drawbacks i wouldn't mind adding a new backend method. Maybe the problem was that there were too many ways to set a result currently to send notifications efficiently? 🤔

kraih commented 1 month ago

Or it might have been that if there are many clients waiting for a result at the same time, every finished job wakes them all up at the same time, triggering a thundering herd problem.

DrMurx commented 1 month ago

Let's maybe focus on the immediate problem. Minion._result is an async method so it returns a Promise itself, even if it's getting passed resolve and reject from the promise created in Minion.result.

Both invocations of _result (here and here) are called without handling a rejection, which will inadvertently crash the application. This is a major flaw.

I'm not sure if my initial idea of just calling reject is correct. In the context of the promise returned from Minion.result, reject should indicate that the job we're waiting for has failed. So rejecting just because of a backend transport failure would break the current interface contract and can cause issues on the caller's side (the only expected rejection other than a failed job is the AbortError from the signal, if it's passed along at all)

Maybe the better approach is to handle a failure in listJobs by simply rerunning _result via setTimeout, just like the final else does.