grantcarthew / node-rethinkdb-job-queue

A persistent job or task queue backed by RethinkDB.
https://github.com/grantcarthew/node-rethinkdb-job-queue/wiki
MIT License

How to run sequential jobs #70

Closed marcuspoehls closed 7 years ago

marcuspoehls commented 7 years ago

Hi everyone,

I’m currently working on a project that relies heavily on background jobs and uses rethinkdb-job-queue at its heart. In my app, I’d love to run jobs sequentially and pass the result of one calculation on to the next job:

Job A -> Job B (use result of Job A) -> Job C (use result of Job B)

Currently, I use it like this: Job A is executed, and once the result is available, I add it to the job’s data and update the job in the database. Then I listen for the completed event, which has the signature (queueId, jobId, isRepeating).

With that, I need to get the job from the database again, fetch the result data, and create Job B with the result from Job A.
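
In code, that flow looks roughly like this (the queue setup and the result field are illustrative placeholders, not my exact code, and I'm assuming Queue.getJob from the wiki to re-fetch the job):

const Queue = require('rethinkdb-job-queue')
const queueA = new Queue({ db: 'jobs' }, { name: 'job-a' })
const queueB = new Queue({ db: 'jobs' }, { name: 'job-b' })

queueA.on('completed', (queueId, jobId, isRepeating) => {
  // The event only carries ids, so the finished job has to be
  // fetched back from the database to read the stored result
  queueA.getJob(jobId).then(([jobA]) => {
    const jobB = queueB.createJob({ input: jobA.result })
    return queueB.addJob(jobB)
  }).catch(console.error)
})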

  1. Do you have a better approach for running sequential jobs?
  2. Am I missing something that would let me pass a result directly via next to the completed event?

Thank you for your help!

grantcarthew commented 7 years ago

Hi Marcus.

Can't you just create the sequential jobs from inside the process handler?

For example:


q.process((job, next) => {
  return fancy.do(job.fancydata).then(() => {
    // Queue the follow-up job before completing this one
    return q.addJob(q.createJob(moreFancyData))
  }).then(() => {
    next('All good mate!')
  }).catch((err) => {
    next(err)
  })
})
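
With that pattern the result never has to round-trip through the completed event; Job B is created as part of Job A's processing.
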
marcuspoehls commented 7 years ago

Hi Grant,

thanks for your help. That would be an idea.

Ah, I just realized I've missed a point that might change the whole solution a bit. I'll give you more context so it's easier to understand.

I'm using rethinkdb-job-queue for the provisioning part of Launch. To create a new server within Launch, the user fills out a form and submits the data. In the backend, there are multiple steps to finish the complete provisioning:

  1. save new server to database
  2. create droplet at DigitalOcean based on the selected size and region
  3. use a delayed and repeating job to check if the droplet was created successfully. This may take some time
  4. if the droplet was created successfully, use another job to fetch the droplet's IP address
  5. start provisioning job

What I'm hesitant about is that each job would need to know its successor. I mean, it's just an architecture issue, and logically the jobs are connected. There's just this idea in my head that each job should be independent and not know about other jobs.

And each job from the steps above has its own queue, so I have separate files, each defining one job's processing.
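
One idea to keep the handlers independent would be to wire the chain up in a single orchestrator module and let the completed events do the hand-off. A rough sketch (the queue names, the dropletId field, and the re-fetch via getJob are placeholders, not my actual code):

const Queue = require('rethinkdb-job-queue')

const createDroplet = new Queue({ db: 'launch' }, { name: 'create-droplet' })
const checkDroplet = new Queue({ db: 'launch' }, { name: 'check-droplet' })

// The hand-off lives here rather than in the process handlers,
// so each handler stays independent of its successor
createDroplet.on('completed', (queueId, jobId) => {
  createDroplet.getJob(jobId).then(([job]) => {
    const check = checkDroplet.createJob({ dropletId: job.dropletId })
    return checkDroplet.addJob(check)
  }).catch(console.error)
})

// Steps 3 to 5 (status check, IP fetch, provisioning) would chain
// the same way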

grantcarthew commented 7 years ago

@viridia had a similar issue in #64, which ended up being better served by a custom solution.

As you know, there are no parent/child job relationships at the moment within rethinkdb-job-queue. I haven't put any effort into designing such a queue feature.

Options which come to mind are:

  1. Create the next job from inside the process handler, before calling next().
  2. Listen for the completed event and create the next job there.
  3. Build a custom workflow layer on top of the queue, as @viridia did in #64.

Which option best suits your needs, I can't really say.

There is another option: add a feature to rethinkdb-job-queue?

How would a parent/child feature work, I wonder.

marcuspoehls commented 7 years ago

Thank you @grantcarthew for your ideas.

Well, I also don't know how to implement a parent-child feature for a job queue.

I really appreciate your help! I will test different setups and check which one fits my needs best. Having sequential jobs always requires some kind of logic to connect the different jobs. Right now I'm doing it within the completed event, but I could also do it before calling next(…).
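
For the second variant, the chaining would move into the process handler itself, something like this (fetchIpQueue, provisionQueue and lookupIp are placeholder names, and the next() usage follows your example above):

fetchIpQueue.process((job, next) => {
  return lookupIp(job.dropletId).then((ip) => {
    // Queue the successor with the result before completing this job
    const provisionJob = provisionQueue.createJob({ ip: ip })
    return provisionQueue.addJob(provisionJob).then(() => {
      next('IP fetched, provisioning queued')
    })
  }).catch((err) => {
    next(err)
  })
})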

grantcarthew commented 7 years ago

You got me searching and thinking about it a little more. I think you have the right idea. The best option is to create the follow-up job in the completed event handler or just before the next() call.

Let me know what you come up with.