timgit / pg-boss

Queueing jobs in Postgres from Node.js like a boss
MIT License
2.14k stars 158 forks source link

Use with transactions in knex #18

Closed jordaaash closed 7 years ago

jordaaash commented 7 years ago

I realize this is a question rather than an issue with the library (which appears to be awesome, by the way!), so please feel free to close if this isn't the place to ask. With that said, I'm interested in publishing a job within a knex transaction.

With Bookshelf, this would allow saving a model in the transaction (and related models that may be saved within the same transaction), and also publishing jobs (if the transaction succeeds).

Since pg-boss seems to maintain its own connection(s) to the database directly, I don't know if this is even possible, but I imagine you have a better idea than I do about how it might be done if it is.

timgit commented 7 years ago

@jordansexton What if you were to wait for pg-boss's publish() resolve a jobId and then do your commit()?` Do you think the commit itself is going to throw?

jordaaash commented 7 years ago

Hmm, I think I understand what you're saying. So, I'm actually not worried about the commit itself throwing, but rather, one of the related models throwing an error (usually related to validation) when being created. The flow we have is something like this:

  1. Start a transaction.
  2. Create an Appointment, get an id
  3. Create an AppointmentStatus linking to appointmentId
  4. Creating an AppointmentStatus triggers a publish of a job with the appointmentStatusId
  5. Create a few AppointmentServices linking to appointmentId.
  6. Validation of one of the AppointmentServices fails with an error
  7. Rollback the transaction, but the job has already been created

Essentially the issue is that the publish call is a side effect that requires data derived during the transaction, so it can't be called in advance to wrap the whole thing. One possible solution might be:

  1. Perform steps 1-4 above.
  2. Get the jobId, and then within the knex transaction, create a Jobs model with the jobId.
  3. Perform steps 5-7 above.

Then, when the job runs, have it first looks for a Job with the matching jobId. If it doesn't find it, it means the transaction was rolled back, and the job should just return. This feels slightly janky, and I have no idea how it would work in a distributed context, but what do you think?

timgit commented 7 years ago

What if you were to publish a deferred job, say with startIn set to 10-20 seconds in the future, then just call boss.cancel() with that jobId if you decide to rollback?

jordaaash commented 7 years ago

That could work, though I think the problem with that would be that whatever is calling boss.cancel() would need to know:

  1. That there is a job (which it might not be expected to know, because job creation is a side effect that could be far removed from whatever fails)
  2. What the jobId is (which would then need to be passed around to anything that is inside the transaction)

However, I think deferring the job for a few seconds for the transaction to be committed is a great idea anyway, because even if following the method I proposed, the job could be run, and look for the corresponding Job by jobId, which wouldn't exist yet outside the transaction if the transaction isn't committed yet.

timgit commented 7 years ago

I wasn't suggesting that pg-boss would participate in the database transaction proper, but rather just behave in a transactional way after your more important items succeed or fail.

jordaaash commented 7 years ago

Got it. Thanks for your help!

timgit commented 7 years ago

You're welcome! Thanks for the feedback. Glad you're finding a good use out of it. It's not just me who likes it. 😁

jordaaash commented 7 years ago

We've been using Kue for a while now, and have had various issues with it. This looks like a great alternative for us, so I'm trying it out in a new branch for our API. I'll definitely let you know if we run into anything interesting!