vsivsi / meteor-job-collection

A persistent and reactive job queue for Meteor, supporting distributed workers that can run anywhere.
https://atmospherejs.com/vsivsi/job-collection
Other
385 stars 68 forks source link

Job Progress and the DB record #195

Closed gary-menzel closed 8 years ago

gary-menzel commented 8 years ago

Hello,

Just getting across all this great functionality and I think I might be confused about Job Progress.

Basically I have an array of items in the data for the job which I am looping over and I am updating the progress (count, total) and also echoing it to the console. The whole job then completes OK but the "progress" object in the completed job record only shows:

completed:1, total:1, percent: 100.

When I expected it to be:

completed:5, total:5, percent: 100

Is it supposed to be updated each time I call Job Progress or have I misunderstood?

Looking at the code, it did not appear to be writing anything back to the collection.

My testing does not yet have any long-running operations but could do in the future and I was hoping to be able to show the progress reactively by subscribing to the collection (think of a transaction system where a file is uploaded daily to apply and there could be several thousand transactions that the job would handle).

vsivsi commented 8 years ago

Hi, thanks for your question. The progress should absolutely be updating, but from your description it looks as though maybe your code isn't correctly setting the total parameter in each call.

For a working example, I suggest you look at the playground sample app, specifically the client code that handles the "work": https://github.com/vsivsi/meteor-job-collection-playground/blob/master/play.coffee#L74-L98

Or, the equivalent code in the standalone node.js worker: https://github.com/vsivsi/meteor-job-collection-playground-worker/blob/master/work.coffee#L57-L73

If you have more questions and can show me the code that is causing you trouble, I'd be happy to take a look.

gary-menzel commented 8 years ago

Thanks for the prompt response. I am working in a separate job at the moment and wont be able to check the actual code for another 6 hours or so (on a machine in a different location). However, essentially I am looping over an array and setting the CURRENT to the loop index (+1) and the TOTAL to the length of the array. When I set the {echo:true} option, the console log is presenting correctly on each iteration through the loop.

So, given the echo is executing, this suggested the parameter validation early in the progress call is passing and at least getting to that line.

I also reviewed both of the examples (and the source) prior to posting the request but I will definitely check again. I am not a coffeescript expert but, looking at job_class.coffee, I was not able to understand how the progress method was writing back to the collection. But...

For more context, I am not running processJobs in a separate worker using DDP but running directly on the main Meteor application against the job collection object as per the Read Me. Everything else in the worker behaves as expected (the jobs load, the worker executes them, and they become COMPLETED in the collection with any additional logs recorded and any final result added).

The culprit may be that the progress method uses a DDP call around line 774. So could it be that I need to provide an allow for the worker? Even so, I was not getting any exceptions.

Below is a mock up of what I am doing (until I can get back to the actual real code later):

// inside the job handler...
var arrayOfThings=[...]; // from the data object of the job
var totalItems = arrayOfThings.length;
for(var i=0;i<totalItems;i++)
{
    // do something....
    job.progress(i+1,totalItems);
}
job.done();
cb();
vsivsi commented 8 years ago

Hi, the code above looks good. You say:

I am not running processJobs in a separate worker using DDP but running directly on the main Meteor application against the job collection object as per the Read Me.

Does this mean that the worker code is executing in the Meteor server environment, or within a Meteor client (browser, Cordova app, etc)?

If it's on the server, then no allow/deny type permissions should be needed. If it's on a client, then yes, communication with the server is occurring via DDP (just as for a node.js worker), and allow/deny rules need to be set accordingly and the client needs to be properly authenticated (holds an auth token for an permitted user account.)

Probably the easiest way to sort out issues like this is to turn on logging and look for the activity in question to see what is happening: https://github.com/vsivsi/meteor-job-collection#logging https://github.com/vsivsi/meteor-job-collection#jcsetlogstreamwritestream---server-only https://github.com/vsivsi/meteor-job-collection#jclogconsole---client-only

gary-menzel commented 8 years ago

Thanks again - I'll turn on the logging once I am back at the machine (day job/night job - and I am in Australia).

Yes - running on the Meteor server (and I had assumed from the documentation that I would not require any permissions settings in that case). I plan to control access through restricting client access to the Meteor methods I will publish. I'll only be moving to the DDP client for workers when I need to scale later.

Fortunately, it is not really hampering any development work as the rest of the mechanism is working perfectly and "progress" tracking will mainly be of benefit when I get longer running batched jobs that will require a UI to indicate the progress.

Again, I'll report back here (and close off) if I find a solution or provide additional details.

gary-menzel commented 8 years ago

And, I should say, I am very impressed with the engineering you have done on this. I am working on a startup project and had looked at a couple of other options for background processing and was almost ready to dedicate the time to code this myself (and had started a design). But it appears you have already covered all the bases I had been considering and then some.

As I said earlier, I am not very clued up on coffeescript but I seem to be able to understand most of what you have written without needing to refer to any coffee doco (which hasn't been the case for me with other packages written in coffee). I guess that just means there is good and bad code in every language.

I don't yet have a real requirement for large files but I will definitely be considering your meteor-file-collection package should I require it (it looks simpler than collectionFS and seems to be more current).

gary-menzel commented 8 years ago

Hello again.... I turned logging on and as far as I can tell from the output (below) there are no errors being triggered yet the record in the collection still only shows completed:1. total:1. Further, I have also added each of the 5 objects to a result variable which is passed to the done call (the result arrives in the collection record as expected). So, I am probably going to leave this issue (still OPEN) for now until I come to building the management UI I will need.

I20160902-21:19:48.492(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], getWork, params: [["send.referral"],{"maxJobs":1}]
I20160902-21:19:48.492(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], getWork, returned: [{"_id":"DKMs58HEaYb4EXA3m","runId":"yosYQgLeBDHvRHzwh","type":"send.referral","data":{"referral":[{"email":"ki@letocin.ni","firstName":"Abbie"},{"email":"tum@tezwi.ad","firstName":"Adelaide"},{"email":"jikim@po.my","firstName":"Frederick"},{"email":"na@ifocahu.bh","firstName":"Albert"},{"email":"zeem@onuohuok.kg","firstName":"Mamie"}],"invitor":{"name":"Admin","email":"it_admin@spreadx.com.au","userId":"vPNAGSYScPCoXaAPT"}},"status":"running","updated":"2016-09-02T13:19:48.486Z","created":"2016-09-02T13:19:46.770Z","priority":0,"retries":0,"repeatRetries":1,"retryWait":300000,"retried":1,"retryBackoff":"constant","retryUntil":"+275760-09-13T00:00:00.000Z","repeats":0,"repeatWait":300000,"repeated":0,"repeatUntil":"+275760-09-13T00:00:00.000Z","after":"2016-09-02T13:19:46.773Z","progress":{"completed":0,"total":1,"percent":0},"depends":[],"resolved":[]}]
I20160902-21:19:48.503(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], jobProgress, params: ["DKMs58HEaYb4EXA3m","yosYQgLeBDHvRHzwh",1,5,{}]
I20160902-21:19:48.503(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], jobProgress, returned: true
I20160902-21:19:48.505(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], jobProgress, params: ["DKMs58HEaYb4EXA3m","yosYQgLeBDHvRHzwh",2,5,{}]
I20160902-21:19:48.505(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], jobProgress, returned: true
I20160902-21:19:48.506(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], jobProgress, params: ["DKMs58HEaYb4EXA3m","yosYQgLeBDHvRHzwh",3,5,{}]
I20160902-21:19:48.506(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], jobProgress, returned: true
I20160902-21:19:48.508(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], jobProgress, params: ["DKMs58HEaYb4EXA3m","yosYQgLeBDHvRHzwh",4,5,{}]
I20160902-21:19:48.509(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], jobProgress, returned: true
I20160902-21:19:48.511(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], jobProgress, params: ["DKMs58HEaYb4EXA3m","yosYQgLeBDHvRHzwh",5,5,{}]
I20160902-21:19:48.511(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], jobProgress, returned: true
I20160902-21:19:48.524(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], jobDone, params: ["DKMs58HEaYb4EXA3m","yosYQgLeBDHvRHzwh",{},{}]
I20160902-21:19:48.524(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], jobDone, returned: true
I20160902-21:19:48.528(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], getWork, params: [["send.referral"],{"maxJobs":1}]
I20160902-21:19:48.528(8)? Fri Sep 02 2016 21:19:48 GMT+0800 (AWST), [SERVER], getWork, returned: []
vsivsi commented 8 years ago

Actually, once you call job.done() the completed and total fields are automatically set to one. This isn't a bug, but a feature, so that you don't need to call job.progress() to get to 100% before calling job.done(). Also, reporting progress is optional (and arbitrary), so this ensures that all completed jobs end up in a consistent and comparable state.

Code: https://github.com/vsivsi/meteor-job-collection/blob/master/src/shared.coffee#L1030

gary-menzel commented 8 years ago

Ahhhh - ok. That explains what is happening. So, essentially, if I set up a UI on the reactive collection, then the progress SHOULD show up while the job is running and then it will be set to 1/1/100% when done is called (and I can use my result or the log to see the status of each step.

I think that then makes this closed from my perspective.