Yelp / python-gearman

Gearman API - Client, worker, and admin client interfaces
http://github.com/Yelp/python-gearman/

Is there a way to get a job's status given a job handle? #61

Open jrdmcgr opened 10 years ago

jrdmcgr commented 10 years ago

I'm trying to submit a task and then come back later to check on its status. At that point I don't have any reference to the job or the request; I only have the job handle.

I've looked through the API documentation and can't find any method that will do this. However, I've noticed that the Gearman protocol has a command to accomplish this.

Any help is appreciated.

oxymor0n commented 9 years ago

I don't know about the protocol part, but what you can do is save the job request object returned by the submit_job() method. This request object has a "state" attribute that tells you whether the job is created, failed, completed, etc., so you can query it later simply by checking the value of request.state.

niklasfemerstrand commented 9 years ago

I ran into this obstacle and came up with a solution. Even though the Gearman protocol allows this via GET_STATUS, the python-gearman implementation has, as far as I've seen, no method for doing it. My solution was to dig into the Gearman internals and craft a GET_STATUS packet by hand. Perhaps this can help someone else.

GET_STATUS

A client issues this to get status information for a submitted job.

Arguments:

  • Job handle that was given in JOB_CREATED packet.

http://gearman.org/protocol/

The GET_STATUS packet header is crafted as follows:

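import struct
# Magic "\0REQ", packet type 15 (GET_STATUS), then the payload size as a big-endian uint32
byte_size = struct.pack(">I", len(handle))
header = b"\x00\x52\x45\x51\x00\x00\x00\x0f" + byte_size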
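For reference, the same packet can be built with the packet type written as a number instead of raw bytes. This is only a minimal sketch, assuming Python 3 and a handle that is already a bytes object; build_get_status is a hypothetical helper name, not something python-gearman provides.

```python
import struct

GET_STATUS = 15  # packet type number listed in the Gearman protocol document


def build_get_status(handle):
    """Return a complete GET_STATUS request packet for a job handle (bytes)."""
    # Header layout: 4-byte magic "\0REQ", 4-byte packet type, 4-byte payload
    # size, both integers big-endian. The payload is just the job handle.
    return b"\x00REQ" + struct.pack(">II", GET_STATUS, len(handle)) + handle


# "H:lap:1" is a made-up handle used purely for illustration.
packet = build_get_status(b"H:lap:1")
```

The first eight bytes of the result are the same bytes as the hand-written literal above, followed by the size and the handle itself.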

If you are working in an environment with multiple Gearman job servers, you need to keep track of which host each job was sent to. You can read that from the object returned by submit_job(), which exposes it as job.connection.gearman_host and job.connection.gearman_port.

The GET_STATUS packet requires you to append the job handle to the header. The handle is available as job.handle on the object returned by submit_job().

Once the GET_STATUS header has been crafted you need to send it to the appropriate job server like:

sock.sendall(header + handle)  # sock is a socket.socket already connected to the job server

Keep in mind that GET_STATUS only works for background jobs, so the full solution would look something like this (untested, coding this example in GitHub's comment form for demonstration only):

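import socket
import struct

# gm is an existing gearman.GearmanClient
job = gm.submit_job("reverse", "Hello world", wait_until_complete=False, background=True)
handle = job.job.handle
# Magic "\0REQ", packet type 15 (GET_STATUS), then the payload size as a big-endian uint32
byte_size = struct.pack(">I", len(handle))
header = b"\x00\x52\x45\x51\x00\x00\x00\x0f" + byte_size
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # named sock so the socket module is not shadowed
sock.connect((job.job.connection.gearman_host, job.job.connection.gearman_port))
sock.sendall(header + handle)
status = sock.recv(20 + len(handle))  # covers the reply only if every status field is a single character
sock.close()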
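A fixed-size recv() can come up short, and the reply grows when the percent-complete numbers have more than one digit. One way around that is to read the 12-byte header first and then exactly as many payload bytes as it announces. The following is a rough sketch, assuming Python 3; recv_exactly and read_packet are hypothetical helper names, not part of python-gearman.

```python
import socket
import struct


def recv_exactly(sock, n):
    """Read exactly n bytes from sock, raising if the peer closes early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_packet(sock):
    """Read one Gearman packet and return (magic, packet_type, payload)."""
    # Header: 4-byte magic, 4-byte packet type, 4-byte payload size (big-endian).
    magic, packet_type, size = struct.unpack(">4sII", recv_exactly(sock, 12))
    return magic, packet_type, recv_exactly(sock, size)
```

With this, the recv() call in the example could become something like magic, packet_type, payload = read_packet(sock), where a STATUS_RES reply should carry magic b"\x00RES" and packet type 20.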

The returned status is a STATUS_RES packet; the fields you are looking for come after the 12-byte packet header and the job handle:

STATUS_RES

This is sent in response to a GET_STATUS request. This is used by clients that have submitted a job with SUBMIT_JOB_BG to see if the job has been completed, and if not, to get the percentage complete.

Arguments:

  • NULL byte terminated job handle.
  • NULL byte terminated known status, this is 0 (false) or 1 (true).
  • NULL byte terminated running status, this is 0 (false) or 1 (true).
  • NULL byte terminated percent complete numerator.
  • Percent complete denominator.

http://gearman.org/protocol/
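To make the field list above concrete, here is a small sketch of how the STATUS_RES payload (everything after the 12-byte packet header) could be split into its five fields. It assumes Python 3 and a bytes payload; parse_status_res is a hypothetical name and the sample handle is made up.

```python
def parse_status_res(payload):
    """Split a STATUS_RES payload into its five NULL-separated fields."""
    # The first four fields are NULL terminated; the denominator is the
    # remainder, so four splits yield exactly five parts.
    handle, known, running, numerator, denominator = payload.split(b"\x00", 4)
    return {
        "handle": handle,
        "known": known == b"1",      # the server knows about this job
        "running": running == b"1",  # a worker is currently running it
        "numerator": int(numerator),
        "denominator": int(denominator),
    }


status = parse_status_res(b"H:lap:1\x001\x001\x0050\x00100")
```

For this sample payload the job is known, running, and 50 out of 100 done.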

Happy hacking :-)

PS: With the approach that polls request.state, you need to store the entire request object to be able to check statuses from other pids. The raw packet solution above works from any pid as long as it can access the host, port and handle. I personally keep track of these in Redis.
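As an illustration of what needs to be stored, the three values can be packed into a single string that any key-value store will accept. This is just a sketch: encode_job_ref and decode_job_ref are hypothetical names, the handle is assumed to be held as text, and a plain dict stands in for Redis here.

```python
import json


def encode_job_ref(host, port, handle):
    """Serialize the three values another process needs to send GET_STATUS."""
    return json.dumps({"host": host, "port": port, "handle": handle})


def decode_job_ref(blob):
    """Inverse of encode_job_ref; returns (host, port, handle)."""
    ref = json.loads(blob)
    return ref["host"], ref["port"], ref["handle"]


# A dict stands in for the external store; the key and values are made up.
store = {}
store["job:42"] = encode_job_ref("localhost", 4730, "H:lap:1")
host, port, handle = decode_job_ref(store["job:42"])
```

Any other process that can read the stored string then has everything it needs to connect to the right job server and ask about the job.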