vsivsi / meteor-job-collection

A persistent and reactive job queue for Meteor, supporting distributed workers that can run anywhere.
https://atmospherejs.com/vsivsi/job-collection
Other
385 stars 68 forks source link

.depends([job]) gives Exception while invoking method 'Play' Error: Each provided object must be a saved Job instance (with an _id) #171

Open ericvrp opened 8 years ago

ericvrp commented 8 years ago

Version 1.3.0 gives the following error when I try to use depends:

Exception while invoking method 'Play' Error: Each provided object must be a saved Job instance (with an _id) I20160409-12:21:17.624(2)? at Job.depends (packages/vsivsi_job-collection/job/src/job_class.coffee:607:23)

Whith the following code.

    var searchJob = new Job(myJobs, 'SearchTest', {
      time  : DEFAULT_SEARCH_TIME,
      play  : false,
      gameId: gameId,
      state : GameState(gameId)
    }).save();

    var playMoveJob = new Job(myJobs, 'PlayMoveTest', {
     searchJobRef: searchJob.data
    }).depends([searchJob]).save();

To me the above code looks alright. Any tips?

P.S. Instead of adding a dependent job to each 'SearchTest' job I would prefer to be able to add a job that depends on jobType (in addition to dependancy on a job).

vsivsi commented 8 years ago

Where are you running that code? job.save() is asynchronous, so unless you are running in a Fiber (on Meteor server), it is unlikely that searchJob will have an _id value yet when you try to make playMoveJob depend on it. I personally dislike depending on Fibers because it makes the code non-portable, so I usually write with callbacks. Your sample above is a good example of this, should work in a Fiber, broken otherwise...

vsivsi commented 8 years ago

You can look at this unit test for inspiration. It's coffeescript, but should be clear what is happening: https://github.com/vsivsi/meteor-job-collection/blob/master/test/job_collection_tests.coffee#L198

ericvrp commented 8 years ago

Coming from the Meteor world and not having written code the Node.js way before I completely overlooked the Fiber explanation in the documentation. This is what is what I ended up using:

    const searchJob = new Job(myJobs, 'Search', searchSettings);

    searchJob.save( function(error, searchJobId) {
      new Job(myJobs, 'PlayMove', {
        searchJobId: searchJobId,
        gameId: gameId
      }).depends([searchJob]).save();
    });

Should I open a feature request in the issue tracker for the 2nd part of the code? I would like to able to make a job depend on the result of a jobType.

vsivsi commented 8 years ago

Hi, I'm unclear what you mean by "depend on the result of a jobType"...

Does that mean creating a job that waits until any other job of a given type successfully completes? If so, that would be very complicated to implement correctly in a general way, and would probably have a million corner cases to worry about.

If you really want that functionality for a specific case, I would recommend implementing it on the server-side using the cursor.observe() functionality in Meteor.

      const searchJob = new Job(myJobs, 'Search', searchSettings);
      searchJob.save();

      // On the server do something like
      obs = myJobs.find({ type: 'Search', status: 'completed' })
         .observe({ added: function () {
                new Job(myJobs, 'PlayMove', {
                    searchJobId: searchJobId,
                    gameId: gameId
                }).save();
           }
      });
0o-de-lally commented 8 years ago

@vsivsi I'm also struggling with this, especially implementing with futures. If you recall I'm working on documentation for Job collection. Please see this link, is this how you would make the canonical example? It's adapted from your tests. https://github.com/keyscores/simplest_example_meteor_job_collection/blob/master/server/depends.js

I have a long chain of operations, and my code is slanting to the right further and further :) It would be nice to see your example for using futures.

Also, I'm wondering about the design of depends(). Is there another variable we can pass, instead of the full object, to bind antecedents/dependents?

vsivsi commented 8 years ago

Hi, I don't personally use promises/futures (or did you mean fibers?). I'm not opposed to them, I typically just end up using other ways of dealing with "the pyramid of doom", such as the async.js library, or event streams.

Right now, job.depends() requires a list of job objects. I suppose it could be extended to accept a list of job _id values. In that case, you can just as easily set job.doc.depends = [ <_id1>, <_id2>, ...] before running job.save(). That's essentially all that job.depends() does.

0o-de-lally commented 8 years ago

I think that extending would be a really good solution, seems more intuitive to me, and flexible.

BTW I'm running into strange random triggering with more than 3 dependencies. It's not following the dependent declarations. I'm trying to do chains like this:

enrichCurrencyJob.save( function(error, res){
});

enrichRegionJob.save( function(error, res){
  enrichRightIDJob.depends( [enrichRegionJob] )
});

enrichRightIDJob.save( function(error, res){
  enrichTaxJob.depends( [enrichRightIDJob] )
});

Any thoughts?

vsivsi commented 8 years ago

Yet more async programming basics...

enrichCurrencyJob.save( function(error, res){
});

enrichRegionJob.save( function(error, res){
  enrichRightIDJob.depends( [enrichRegionJob] )  // This probably runs After line noted below
});

enrichRightIDJob.save( function(error, res){   // This probably runs Before line noted above
  enrichTaxJob.depends( [enrichRightIDJob] )
});

The net result is that enrichRightIDJob is saved before its dependency is set.

vsivsi commented 8 years ago

What you probably intended...

enrichCurrencyJob.save( function(error, res){
});

enrichRegionJob.save( function(error, res){
  enrichRightIDJob.depends( [enrichRegionJob] ); 
  enrichRightIDJob.save( function(error, res){   
    enrichTaxJob.depends( [enrichRightIDJob] );
    // ...
  });
});
0o-de-lally commented 8 years ago

Yeah, that's the part I want to avoid, since I have a chain of 15 of those... It would be nice to have a single Future to tell me when I can update with the actual _id. Thoughts?

vsivsi commented 8 years ago

https://github.com/caolan/async#seriestasks-callback

0o-de-lally commented 8 years ago

That's interesting, but I'd really like to use Job Collection :) I'll try to add by id and see how that goes. Would still look forward to see depends() extended to also accept an id.

vsivsi commented 8 years ago

Hi, I think you misunderstand. You can use async.series to manage the callback pyramid when creating and saving your jobs. Even creating the dependent jobs is a series of async operations...

async.series([
   function (cb) {  enrichCurrencyJob.save(cb); },
   function (cb) {  enrichRegionJob.save(cb); },
   function (cb) {  
     enrichRightIDJob.depends( [enrichRegionJob] );
     enrichRightIDJob.save(cb);
   },
   function (cb) {
     enrichTaxJob.depends( [enrichRightIDJob] );
     enrichTaxJob.save(cb);
   }
  ], function (err, results) {
   if (err) {
      // handle it
   } else { 
     // do something with results
   }
});
0o-de-lally commented 8 years ago

Ok That makes sense, I was assuming I should call the one job with all the methods in the .series() . Let me try that. Thanks!

0o-de-lally commented 8 years ago

@vsivsi , The approach above works very well. There is an edge case, possibly unrelated.

Seems like when I have a heavy load on the DB I get an exception. I guess this is because the job object is not getting returned?

2016-05-05 17:03:19-04:00 Exception in setTimeout callback: Error: Each provided object must be a saved Job instance (with an _id)

Wonder if passing the _id string to depends() would prevent cases like this. Thanks in advance

vsivsi commented 8 years ago

I really can't say without seeing your actual code. It's all guesswork without the running code in front of me. The return value to the callback is the _id of the newly saved job, and should match the _id value in the job object. Something like:

job.save(function (err, idRet) {
   assert(job._id == idRet)
});

If that isn't true then something strange is happening.

0o-de-lally commented 8 years ago

Thanks, I'm working on a reproduction repo, but I can't seem to trigger this in local dev...

serkandurusoy commented 7 years ago

@vsivsi I would like to chime in here with what I think is the problem.

It seems that all the people who have commented here (including me) are using this packae within the context of meteor, and as by the official example on the readme suggests

// Create a job:
var sendJob = new Job(myJobs, 'sendEmail', {
  address: 'bozo@clowns.com',
  subject: 'Critical rainbow hair shortage',
  message: 'LOL; JK, KThxBye.'
}).save();

// Assuming synchronous style with Fibers...
// archiveJob will not run until sendJob has
// successfully completed.
var archiveJob = new Job(myJobs, 'archiveEmail', {
 emailRef: sendJob.data
}).depends([sendJob]).save();

we are all expecting sendJob to be a valid Job instance, whereas sendJob returns a String equivalent to the _id of the job.

Now this is confusing becuase the readme example (assuming fibers which we are all within the context of one, because meteor) suggests

As for callbacks and the current state of the javascript ecosystem / language features allow for synchronous-style programming already with promoted tools like promises, async/await or even by Meteor's synchronous style (through fibers), it should be normal that a lot of people expect to be able to use the return value as already shown in the example.

I mean no disrespect, and you are already going out of your way to help me and others with a lot of extra tools that we could only wish for, I humbly suggest you take this one as a bug-ish report.

In summary:

I would expect

Thank you!

vsivsi commented 7 years ago

@serkandurusoy Yup, that's a bug in the example code. Should be:

// Create a job:
var sendJob = new Job(myJobs, 'sendEmail', {
  address: 'bozo@clowns.com',
  subject: 'Critical rainbow hair shortage',
  message: 'LOL; JK, KThxBye.'
});

// Save it.
sendJob.save();

// Assuming synchronous style with Fibers...

// archiveJob will not run until sendJob has
// successfully completed.
var archiveJob = new Job(myJobs, 'archiveEmail', {
 emailRef: sendJob.data
}).depends([sendJob]).save();

Fixed in master.

serkandurusoy commented 7 years ago

Oh wow, it never occured to me that calling save() later would have a different outcome than chaining it!

Thank you for clarifying this and fixing the documentation!

vsivsi commented 7 years ago

Save terminates the chain because once you have saved the job, you need to pause it first in order to modify and save it again. That was intentional.

vsivsi commented 7 years ago

And also, job.save() is an async operation, so it can't be chained in a non-fiber environment.

serkandurusoy commented 7 years ago

Okay, I'll try this out and let you know asap

serkandurusoy commented 7 years ago

For what its worth, I think this issue should now be closed because I do think the OP's question is also answered by the fix on the readme example.