FirebaseExtended / firebase-queue

MIT License

Mutate the task data when job fails #34

Closed dreadjr closed 8 years ago

dreadjr commented 8 years ago

I am using Firebase Queue to submit to external systems. The problem I am running into is that if a task fails halfway through, some submissions have succeeded and others have not. I am unable to update the task, through either reject or progress, to remove the completed parts and avoid duplicate requests.

I could submit to another Firebase queue that does the submission process, but I wanted to get your thoughts on that as well, since it seems to have the same problem.

cbraynor commented 8 years ago

I'd recommend in this case that you split your tasks into smaller sub-tasks that you chain together to achieve your overall goal. That way you can tune how long each task is supposed to take and how many times it retries, and resume halfway through without repeating work.
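
A minimal sketch of the chaining described above, assuming two hypothetical steps named `submit_to_a` and `submit_to_b` (the step and state names are made up for illustration). In firebase-queue, a spec whose `start_state` equals another spec's `finished_state` picks the task up where the previous step left it, and `timeout` and `retries` can be set per spec. The `nextSpecId` helper is not part of the library; it only makes the chain explicit.

```javascript
// Hypothetical specs: each step's finished_state is the next step's start_state
var chainedSpecs = {
  submit_to_a: {
    in_progress_state: 'submitting_to_a',
    finished_state: 'a_done',
    error_state: 'a_error',
    timeout: 10000, // milliseconds
    retries: 3
  },
  submit_to_b: {
    start_state: 'a_done',
    in_progress_state: 'submitting_to_b',
    error_state: 'b_error',
    timeout: 30000,
    retries: 1
  }
};

// Helper (not part of firebase-queue): find the spec that picks up after specId
function nextSpecId(specs, specId) {
  var finished = specs[specId].finished_state;
  if (!finished) { return null; } // no finished_state: the task is removed on resolve
  var ids = Object.keys(specs);
  for (var i = 0; i < ids.length; i++) {
    if (specs[ids[i]].start_state === finished) { return ids[i]; }
  }
  return null;
}
```

With this layout, a failure in the second step only retries the second step; once the task has reached `a_done`, the first submission is not repeated.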

dreadjr commented 8 years ago

The problem is the number of tasks is dynamic. Unless I am misunderstanding.

On Wednesday, September 30, 2015, Chris Raynor notifications@github.com wrote:

I'd recommend in this case that you split your tasks into smaller sub-tasks that you chain together to achieve your overall goal - that way you can tune how long a task is supposed to take and how many times it retries, and resume halfway through without repeating work


cbraynor commented 8 years ago

Ah, I see, that's more challenging. Do your tasks need to be executed in series, or can they be executed in parallel?

If they can be run in parallel, I'd suggest a fan-out task as the ingress point to the queue. It simply adds the required number of sub-tasks to the queue in one update() command on the tasks ref, and then resolves (with no finished_state, so it gets removed). The sub-tasks can then be consumed by a separate pool of workers.

If they need to be run in series, it's a little more complicated. You could take a look at the transaction logic in Firebase Queue itself, which ensures a worker only mutates a task that it currently owns, and then pass the ID of the task as a parameter to mimic that. Alternatively, you could have a large number of almost identical tasks that short-circuit and resolve if all the work has been done. There's no built-in way for now to deal with a variable number of chained tasks.
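
For the in-series case, the ownership check could be sketched roughly as below. It assumes the task record carries the `_state` and `_owner` fields that firebase-queue writes while a task is in progress. `makeOwnedUpdate`, the `pending` field and the owner value `'worker-1:0'` are hypothetical, and how a worker learns its own owner value is left open here. The returned function has the shape that a Firebase `transaction()` update function expects: returning `undefined` aborts the transaction, returning an object writes it.

```javascript
// Hypothetical helper: build an update function for taskRef.transaction()
// that only changes the task while ownerId still owns it.
function makeOwnedUpdate(ownerId, inProgressState, mutate) {
  return function(task) {
    if (task === null) { return task; } // no task data: leave it as it is
    if (task._owner !== ownerId || task._state !== inProgressState) {
      return undefined; // owned by someone else or in another state: abort
    }
    return mutate(task);
  };
}

// Example: drop the first entry of a made-up `pending` list once it has been submitted
var dropFirstPending = makeOwnedUpdate('worker-1:0', 'working', function(task) {
  task.pending = task.pending.slice(1);
  return task;
});
```

It would be passed as `taskRef.transaction(dropFirstPending)`, so a worker that has lost the task (for example after a timeout) cannot overwrite another worker's changes.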

dreadjr commented 8 years ago

Thanks @drtriumph

They can be run in parallel. Are you suggesting using two queues, one for the master task and one for the target of the fan-out task, or a single queue?

cbraynor commented 8 years ago

Something like this:

var Queue = require('firebase-queue'),
    Firebase = require('firebase'),
    _ = require('lodash');

var ref = new Firebase('https://<your-firebase>.firebaseio.com/queue');

var taskSpecs = {
  fanout: {
    in_progress_state: 'fanning_out',
    error_state: 'fanout_error'
  },
  hard_work: {
    start_state: 'start_work',
    in_progress_state: 'working',
    finished_state: 'work_done',
    error_state: 'work_error'
  }
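};

// Workers look up their spec under the queue's `specs` child, so store the specs there
ref.child('specs').update(taskSpecs);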

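var fanoutQueue = new Queue(ref, { specId: 'fanout' }, function(data, progress, resolve, reject) {
  var newTasks = {};
  // Create exactly data.numJobs sub-tasks; jobId and _state go last so data cannot overwrite them
  for (var i = 0; i < data.numJobs; i++) {
    newTasks[ref.push().key()] = _.assign({}, data, { jobId: i, '_state': 'start_work' });
  }
  // Add all sub-tasks in one update, then resolve (or reject) the fan-out task
  ref.child('tasks').update(newTasks, function(error) {
    if (error) { return reject(error); }
    resolve();
  });
});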

var processingQueue = new Queue(ref, { specId: 'hard_work' }, function(data, progress, resolve, reject) {
  // Do the work for data.jobId here, then call resolve() on success or reject(error) on failure
  resolve();
});

Then you can push onto the queue something like:

ref.child('tasks').push({
  some: 'data',
  numJobs: 10
});
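
To check the number and shape of the sub-tasks without a Firebase connection, the fan-out loop can be pulled into a pure function. This is only a sketch: `buildSubTasks` and `makeKey` are hypothetical names, `makeKey` stands in for `ref.push().key()`, and `Object.assign` is used in place of `_.assign` so the snippet has no dependencies. It assumes the intent is one sub-task per job, with job IDs running from 0 to numJobs - 1.

```javascript
// Hypothetical helper: build the multi-path update object for the fan-out
function buildSubTasks(data, makeKey) {
  var newTasks = {};
  for (var i = 0; i < data.numJobs; i++) {
    // jobId and _state are applied last so fields in data cannot overwrite them
    newTasks[makeKey()] = Object.assign({}, data, { jobId: i, _state: 'start_work' });
  }
  return newTasks;
}

// Fake key generator for local testing only
var counter = 0;
var tasks = buildSubTasks({ some: 'data', numJobs: 10 }, function() {
  return 'key' + (counter++);
});
console.log(Object.keys(tasks).length); // 10
```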