Closed huttarichard closed 8 years ago
Sorry for the delay, I'm looking into this now.
I have confirmed this as a bug:
j = new Job(testColl, jobType, { foo: "bar" })
j.save()
j.refresh()
j.cancel()
j.refresh()
j.restart()
j.refresh() # Check fails here.
I'll work on a fix over the weekend, shouldn't take too long.
@vsivsi sweet!
I dig little bit around, just to help you out:
https://github.com/vsivsi/meteor-job-collection/blob/master/src/shared.coffee#L728
here is problem, since if you restart job and job has 0 retried
it will do an {$inc: -1}
, which will make job retried
-1 and on refresh action it will validate doc and throw match error.
here is test https://github.com/vsivsi/meteor-job-collection/pull/184
Great, thanks, I'll try to get this wrapped up today.
Hi, so I've worked up a solution, on the dev
branch, but I'm not 100% satisfied with it.
The line that causes this issue is here: https://github.com/vsivsi/meteor-job-collection/blob/master/src/shared.coffee#L753
This bug is actually a little worse than it originally seemed. It's not just an "off-by-one" bug. If you pass, say 5 in options.retries
, then unless job.retried
> 5, the same failure you reported will occur.
The purpose of subtracting the new options.retries
value (default 1) from job.retried
is to keep job.retries + job.retried
constant. This is important, because this information is used to calculate the new value of job.retries
when a repeating job runs again. So, just removing L753 has the side-effect of increasing the number of retries on future repeating runs by the amount of options.retries
(default 1).
Here are the possible options as I see them, with pros / cons:
job.retried
at L753, and allow options.retries
on job.restart()
to have a value of zero (previously >= 1). This is what is currently implemented on the dev
branch. Pros: Simple change, allow previous behavior for jobs cancelled before they run out of retries. Cons: Any non-zero value for options.retries
(including the default) will cause the retry count of future runs of a repeating job to increase by options.retries
. This is a behavior change that seems relatively harmless, but I don't love it.job.retried
to be a negative integer. Pros: This is probably the simplest change, with the fewest side-effects. Cons: This is a subtle, but non-trivial change to the document model, which has been frozen for a long time, and so I don't want to do it lightly. Given the check that has long been in place (that you encountered) it is unlikely that any code out there will break by making this more permissive, but it introduces a set of possible states for job.retried
that are semantically unclear (what does a negative value for retried
actually mean? job.refresh()
.job.refresh()
. Pro: Easy, Con: Super hacktastic. 😝 repeatRetries
attribute to explicitly capture the proper state of job.retries
for the next repeat, without resorting to calculating it from the sum of job.retries
and job.retried
. Pro: Arguably the right way to fix this, how it should have been done in the first place. Cons: _By far_ the most disruptive. It involves a major change to the document model, and changes to the implementation of multiple methods and a fair amount of code. This requires more planning and testing effort to make sure nothing breaks. It is possibly a semver 2.0 level change, although perhaps not, if the method interfaces and behavior of the current attributes don't change.I'm leaning towards 2) or 4), but I need to look into how much work option 4) will really require.
Thoughts?
I was actually really confused about the -options.retries
(also reason why the I didn't make a pull request right away). For me (or others) there could be simple patch for now:
put scrub function into collection, this will prevent doc retried
property to be an negative int.
scrub(doc) {
if(doc.retried < 0) doc.retried = 0;
return doc;
}
and we can think about one of the options you wrote above. I believe that you know this source best, so its pointless for me to get deeper, because from 99% I will endup where you just did. So here is my suggestion.
4) sounds awesome, will require more work, but mostly more testing and thats the part which can be done from me side. I can write more specs, that is no problem. In a meantime you can implement option 2) which is 100% better then scrub function.
what does a negative value for retried actually mean?
I wouldn't be concerned about this, because chance that you will get in touch with retried
negative value is really small :)
Alright, I'm going to switch the dev branch to option 2) because it's easy, while I continue to investigate 4). If necessary, I have no problem doing a release with option 2), it's basically the status quo without the check failure. Code-wise I'm convinced it's safe, I just don't like the semantics, but that 98% affects only me.
FYI, dev
branch is now option 2), works fine.
Okay, 4) wasn't too bad. It's on the repeatRetries
branch. Note that this also requires an update to the meteor-job
sub-project in the job
directory (which also now has a repeatRetries
branch).
Please test with this branch and let me know how it goes. The new repeatRetries
attribute of the job document is optional, so this is not a breaking change. Existing databases populated with old jobs should continue to work. And new job documents remain compatible with older servers and clients as well.
Not sure if you've done package development in Meteor before or not. To run this in your app, the following should work:
cd <yourAppDir>
mkdir packages
cd packages
git clone --recursive https://github.com/vsivsi/meteor-job-collection.git
cd meteor-job-collection
git pull
git checkout repeatRetries
cd job
git pull
git checkout repeatRetries
cd ../../..
meteor
@vsivsi awesome, really awesome. Cannot wait ... tomorrow I will write some specs about this (in my app). So going to let you know how it goes and if I find out something interesting. I really appreciate your effort. Thanks.
All of the changes proposed to go into 1.4.0 have now been merged into the dev
branch (including repeatRetries
). Please test against the dev
branch now. I'll release as soon as I hear back that everything looks good on your end.
@vsivsi ok ok, Im on it...
@vsivsi I removed old version of job collection (+ scrub function), push new one. Run all specs and all passes (three specs just for this cancel / refresh issue)... Looks good for me...
Great! I'm just chasing a test failure in meteor-job that only occurs on node.js 0.10, (related to a different new feature) not sure how long that will take, but once I understand why node 0.10 is behaves differently it should be quick to fix the test. So, I should release within a few hours.
PS. When I wrote:
Please test against the dev branch now
By "now", I meant to now use the dev
branch instead of the repeatRetries
branch, not to command you to do it immediately! 😄
By "now", I meant to now use the dev branch instead of the repeatRetries branch, not to command you to do it immediately!
😄 😄 😄 ahh, doesn't matter anymore
Fix just published to Atmosphere as version 1.4.0. Thanks again for reporting this!
@vsivsi thank you man!
First of all, great job with collection, awesome piece of sw!! I was working with job collection for a while, but I still unpredictable run into one issue:
Did not found the source (no coverage no specs) and partialy I was convinced that is probably my mistake. But now I think that I found a bug in shared.coffee
so stack is pointing to:
https://github.com/vsivsi/meteor-job-collection/blob/master/src/shared.coffee#L395
Here is what I did:
I put scrub function in job collection which shows me that retries value is -1 which go againts gte 0.
vsivsi:job-collection@1.3.3 METEOR@1.3.2.4