ysmolski closed this 10 years ago
The scenario you describe is hard to solve with the current beanstalkd protocol [1]. The only way I can come up with is to call stats-job to retrieve 'time-left' and check it before deleting. If the job has timed out, don't send the delete.
I think the most 'sane' place to call stats-job would be in the Job-constructor. Thoughts?
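A minimal sketch of that check, assuming a job object that exposes `stats()` (returning the stats-job fields as a dict with `state` and `time-left`) and `delete()`, as beanstalkc's `Job` does. The helper name `delete_if_still_reserved` and the `StubJob` class are hypothetical and only there so the example runs without a server. There is still a race between the stats-job call and the delete, and the stats do not say which client holds the reservation, so this narrows the window rather than closing it.

```python
def delete_if_still_reserved(job):
    """Send delete only if the server still reports the job as reserved.

    Returns True if a delete was sent, False if it was skipped.
    """
    stats = job.stats()  # one stats-job round trip
    still_reserved = stats.get('state') == 'reserved'
    time_left = int(stats.get('time-left', 0))
    if still_reserved and time_left > 0:
        job.delete()
        return True
    return False


class StubJob:
    """Hypothetical stand-in for a reserved job, used only for illustration."""

    def __init__(self, state, time_left):
        self._stats = {'state': state, 'time-left': time_left}
        self.deleted = False

    def stats(self):
        return self._stats

    def delete(self):
        self.deleted = True


fresh = StubJob('reserved', 30)   # TTR has not run out yet
stale = StubJob('ready', 0)       # TTR expired, job went back to ready
sent_for_fresh = delete_if_still_reserved(fresh)
sent_for_stale = delete_if_still_reserved(stale)
```

Doing this in the Job constructor would only capture the stats at reserve time; the check has to run right before the delete to be of any use.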
The delete behaviour is not really a problem but rather fundamentally correct as far as the protocol is concerned. Any job in "ready" state can be deleted by any worker. From the protocol spec: "A client can delete jobs that it has reserved, ready jobs, and jobs that are buried."
Correct. I was referring to the fact that it is rather hard for a client to know whether the job it is processing is still reserved on the server by the client itself. It also requires the stats-job call to figure out how much time the job has left, but even then it is not possible to determine whether you are actually the one who has reserved it or not.
To elaborate a bit on the problem described by ysmolski: Worker 1 reserves the job. Worker 1 is disconnected from the server before completion, which prevents it from deleting the job when done. The job times out on the server and is put back into the ready state. Worker 2 reserves the job. The network comes back for worker 1. Worker 1 issues the delete. Worker 2 completes and issues a delete, which gets "NOT_FOUND\r\n" as the response and raises the CommandFailed exception for no reason apparent to worker 2.
You cannot delete jobs which are currently reserved by another worker, that's prohibited by the protocol / server. So in this scenario, once worker 1 reconnects the job is reserved by worker 2, so when worker 1 tries to delete the job, this delete will simply fail.
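The rule can be illustrated with a toy in-memory model of a single job. This is only a sketch of the delete rules quoted from the protocol spec, not beanstalkd's implementation; the `ToyServer` class and its method names are made up for the example, and buried jobs are left out.

```python
class ToyServer:
    """Toy model of one job's state on the server (illustration only)."""

    def __init__(self):
        self.state = 'ready'   # 'ready', 'reserved', or 'deleted'
        self.owner = None      # worker currently holding the reservation

    def reserve(self, worker):
        if self.state != 'ready':
            return 'TIMED_OUT'  # what reserve-with-timeout would report
        self.state, self.owner = 'reserved', worker
        return 'RESERVED'

    def ttr_expired(self):
        # When the TTR runs out, a reserved job goes back to ready.
        if self.state == 'reserved':
            self.state, self.owner = 'ready', None

    def delete(self, worker):
        # Allowed for ready jobs and for jobs reserved by the caller.
        own_reservation = self.state == 'reserved' and self.owner == worker
        if self.state == 'ready' or own_reservation:
            self.state, self.owner = 'deleted', None
            return 'DELETED'
        return 'NOT_FOUND'


server = ToyServer()
server.reserve('worker1')
server.ttr_expired()                       # worker 1 was offline too long
server.reserve('worker2')
late_delete = server.delete('worker1')     # rejected: reserved by worker 2
final_delete = server.delete('worker2')    # succeeds: own reservation
```

In this model the late delete from worker 1 is answered with NOT_FOUND and worker 2's delete succeeds. Worker 1 could only remove the job during the window in which it is back in the ready state.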
Ah! I did not know that. Thanks!
On 11/11/2013, at 01.10, Andreas Bolka notifications@github.com wrote:
You cannot delete jobs which are currently reserved by another worker. So in this scenario, once worker 1 reconnects the job is reserved by worker 2, so when worker 1 tries to delete the job, this delete will simply fail.
I see now. That's really nice. So my addition seems to be correct according to the protocol.
BTW, the tests in the documents I created pass.
I'm going to open a pull request.
Pull request: https://github.com/earl/beanstalkc/pull/38
I have implemented a class called NetworkSafeConnection [1].
It handles two problems of the regular Connection:
The only problem is that if a socket error happens while a job is reserved and not yet deleted by, say, worker1, then after the server restarts the job becomes available for reservation by other workers. At the same time, worker1 can still delete it once its connection is restored.
Please let me know how you feel about such an extension of functionality. Do you think others can benefit from it?