spiral-modules / jobs

RoadRunner: Background PHP workers, Queue brokers
MIT License

How can I find out that the job ended with an error #17

Open myavchik opened 5 years ago

myavchik commented 5 years ago

Hello. How can I find out (or catch, or log) that a job ended with an error? BaseTest runs the same check, $this->assertNotEmpty($id);, for both the "error job" and the "success job".

I found an example in spiral/framework JobDispatcher.php:

        $consumer->serve(function (\Throwable $e = null) {
            if ($e !== null) {
                $this->handleException($e);
            }
            $this->finalizer->finalize(false);
        });

Is this the only way to catch it? Are there any other ways to catch it on the "producer" side?

wolfy-j commented 5 years ago

Hi,

currently, there is no centralized storage broker that lets you check a job's status by its ID. This can be added on either the PHP or the Golang end (more reliable, I think).

Can you describe the desired flow in more detail so we can plan it properly? Our team has been requesting the same functionality and I want to make sure it's aligned.
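
For illustration, the kind of status lookup by job ID discussed above could be sketched in Go roughly as follows. This is only a minimal in-memory sketch under stated assumptions: `StatusStore`, `Status` and the status values are hypothetical names, not part of spiral/jobs, and a real implementation would need persistent storage shared between processes.

```go
package main

import (
	"fmt"
	"sync"
)

// Status is a hypothetical job state; these names are not from spiral/jobs.
type Status string

const (
	StatusQueued Status = "queued"
	StatusDone   Status = "done"
	StatusFailed Status = "failed"
)

// StatusStore keeps the last known status of each job in memory.
// It is safe for concurrent use but loses everything on restart.
type StatusStore struct {
	mu    sync.RWMutex
	state map[string]Status
}

// NewStatusStore returns an empty store.
func NewStatusStore() *StatusStore {
	return &StatusStore{state: make(map[string]Status)}
}

// Set records the latest status for a job ID, overwriting any earlier one.
func (s *StatusStore) Set(id string, st Status) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.state[id] = st
}

// Get returns the status and whether the job ID is known at all.
func (s *StatusStore) Get(id string) (Status, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	st, ok := s.state[id]
	return st, ok
}

func main() {
	store := NewStatusStore()
	store.Set("job-1", StatusQueued) // producer pushed the job
	store.Set("job-1", StatusFailed) // consumer reported an error
	st, ok := store.Get("job-1")
	fmt.Println(st, ok) // prints "failed true"
}
```

A producer would then poll `Get` with the ID it received when pushing the job, instead of only asserting that the ID is non-empty.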

wolfy-j commented 5 years ago

We are using Amazon SWF (https://aws.amazon.com/en/swf/faqs/) to run complex workflows. It is possible to implement a similar pattern (decider + worker), but it will require at least one more data store to keep the execution state.

myavchik commented 5 years ago

A pattern similar to SWF would be great, much more than I expected. At first I was thinking about something similar to AMQP "message acknowledgment": https://www.rabbitmq.com/confirms.html

In order to make sure a message is never lost, RabbitMQ supports message acknowledgments. An ack(nowledgement) is sent back by the consumer to tell RabbitMQ that a particular message has been received, processed and that RabbitMQ is free to delete it.

If a consumer dies (its channel is closed, connection is closed, or TCP connection is lost) without sending an ack, RabbitMQ will understand that a message wasn't processed fully and will re-queue it. If there are other consumers online at the same time, it will then quickly redeliver it to another consumer. That way you can be sure that no message is lost, even if the workers occasionally die.
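
The acknowledgment behaviour quoted above can be sketched as a toy in-memory queue in Go. This is not RabbitMQ or spiral/jobs code; `Queue`, `Deliver`, `Ack` and `Requeue` are hypothetical names, and putting a lost message back at the front of the queue is a simplifying assumption.

```go
package main

import "fmt"

// Queue is a toy broker: messages wait in ready, and move to unacked
// once delivered, until the consumer acknowledges them.
type Queue struct {
	ready   []string
	unacked map[int]string
	nextTag int
}

// NewQueue returns an empty queue.
func NewQueue() *Queue {
	return &Queue{unacked: make(map[int]string)}
}

// Publish appends a message to the ready list.
func (q *Queue) Publish(msg string) {
	q.ready = append(q.ready, msg)
}

// Deliver hands the oldest ready message to a consumer and tracks it
// as unacknowledged under a fresh delivery tag.
func (q *Queue) Deliver() (tag int, msg string, ok bool) {
	if len(q.ready) == 0 {
		return 0, "", false
	}
	msg = q.ready[0]
	q.ready = q.ready[1:]
	q.nextTag++
	q.unacked[q.nextTag] = msg
	return q.nextTag, msg, true
}

// Ack tells the queue the message was processed and can be deleted.
func (q *Queue) Ack(tag int) {
	delete(q.unacked, tag)
}

// Requeue models a consumer dying without an ack: the message goes
// back to the front of the ready list for redelivery.
func (q *Queue) Requeue(tag int) {
	msg, ok := q.unacked[tag]
	if !ok {
		return
	}
	delete(q.unacked, tag)
	q.ready = append([]string{msg}, q.ready...)
}

func main() {
	q := NewQueue()
	q.Publish("job-1")

	tag, _, _ := q.Deliver()
	q.Requeue(tag) // first consumer died before acknowledging

	tag2, msg, _ := q.Deliver() // redelivered to another consumer
	q.Ack(tag2)

	fmt.Println(msg, len(q.unacked), len(q.ready)) // prints "job-1 0 0"
}
```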

wolfy-j commented 5 years ago

All the brokers included in jobs already implement ACKs, including RabbitMQ. What we currently lack is more complex logic here (i.e. what to do when a job has already failed):

https://github.com/spiral/jobs/blob/master/service.go#L308

This method will be invoked if the job fails after multiple attempts. I guess we can improve it in the future to automatically notify the application about the dead job and let it decide what to do next. It can even be a PHP callback.

Technically, implementing something like SWF is very simple: we can push an event to the application after each job execution/failure and let your code decide what to do next or update the state machine.

I guess this is something we can drastically improve after the release and completion of documentation.

wolfy-j commented 5 years ago

Actually, all the job events are already available and can be used to implement a state machine, but only on the Golang end; we still have to expose them to PHP as well.