edgurgel / verk

A job processing system that just verks! 🧛‍
https://hex.pm/packages/verk
MIT License

Defining contextual data for worker processes #198

Closed mskv closed 4 years ago

mskv commented 4 years ago

I have a feature suggestion (or maybe it's already supported somehow?).

Maybe it would be easier to start with use cases:

For instance, Sidekiq achieves this through "middlewares", pluggable pieces of code that can run before enqueueing and around job processing: https://github.com/mperham/sidekiq/wiki/Middleware

From what I understand, Verk only offers read-only access to the job lifecycle through the Event Manager. What I would need is a way to plug some code into the worker process itself, so that it has access to that process's dictionary.

All of the above could be achieved currently by using Job's args. When enqueuing a job, I could include any metadata I need in the args and then every single perform callback in my job definitions would need to expect those metadata and handle them. This is not very convenient though for global configuration like the correlation-id inclusion.

What do you think about this? Or maybe am I missing something in the current implementation?

edgurgel commented 4 years ago

Hey @mskv,

I'm not 100% sure if I understood what you are looking for but we currently have the whole Job information available inside the Process dictionary: https://github.com/edgurgel/verk/blob/20af6bbfa3fb621e267e2909e44a7ded7b7eaf99/lib/verk/worker.ex#L14

Would this help with your problem?

mskv commented 4 years ago

Thanks for the link. I don't think it helps in this case. Maybe I'll expand on the correlation-id example.

I have a web server. At the beginning of each request it generates a correlation-id. It adds it to Logger metadata. So every single log message generated when handling the request contains correlation-id. When the process handling the web request enqueues a Verk job, I would like to attach this correlation-id to the job. Then, when this job is picked up by the worker process, the correlation-id could get attached to Logger metadata. This way not only the whole request handling is correlated, but also everything happening in the background.

This is already possible - just attach it to the args of a job and manually handle it in perform callback:

Verk.enqueue(%Verk.Job{
  queue: :default,
  class: "ExampleWorker",
  args: [1, 2, Logger.metadata()]
})

defmodule ExampleWorker do
  def perform(arg1, arg2, logger_metadata) do
    Logger.metadata(logger_metadata)

    arg1 + arg2
  end
end

The only way not to do this for every single worker module would be metaprogramming.
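
To make the metaprogramming option concrete, here is a minimal sketch of what it could look like. Everything in it is an assumption for illustration: `WithLoggerMetadata` and the `run/N` convention are hypothetical names and not part of Verk, and the metadata is assumed to arrive as a map with string keys (as it would after a JSON round trip) and as the last argument, as in the example above. Only `correlation_id` is restored.

```elixir
defmodule WithLoggerMetadata do
  # Hypothetical helper, NOT part of Verk.
  # A worker that does `use WithLoggerMetadata` defines run/N, and this
  # module generates a matching perform/(N + 1) that restores the Logger
  # metadata before delegating to run/N.

  defmacro __using__(_opts) do
    quote do
      @before_compile WithLoggerMetadata
    end
  end

  defmacro __before_compile__(env) do
    defs =
      for {:run, arity} <- Module.definitions_in(env.module, :def) do
        args = Macro.generate_arguments(arity, env.module)

        quote do
          # The metadata map is assumed to be the last job argument.
          def perform(unquote_splicing(args), metadata) do
            WithLoggerMetadata.restore(metadata)
            run(unquote_splicing(args))
          end
        end
      end

    {:__block__, [], defs}
  end

  # Assumes string keys, as produced by decoding the job from JSON.
  def restore(metadata) when is_map(metadata) do
    Logger.metadata(correlation_id: Map.get(metadata, "correlation_id"))
  end
end

defmodule ExampleWorker do
  use WithLoggerMetadata

  # The worker body no longer mentions the metadata at all.
  def run(arg1, arg2) do
    arg1 + arg2
  end
end
```

With this in place, calling `ExampleWorker.perform(1, 2, %{"correlation_id" => "abc-123"})` sets the correlation-id in the Logger metadata and then runs `run(1, 2)`. The enqueueing side still has to append the metadata map to the args itself.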

I guess my question is whether I was missing something in this regard. I wanted to point to Sidekiq Middlewares as a streamlined example of how they handle this. But I understand it may be outside the scope of the project, since, as shown above, it's already possible to achieve this using existing tools.

edgurgel commented 4 years ago

@mskv ,

Yeah, I think the best approach is to include this metadata as part of the arguments. Maybe always use the first argument as a metadata map, so you can add whatever is useful for tracking this information?
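
A rough sketch of that convention, under a few assumptions: `JobMetadata` and `MetadataFirstWorker` are hypothetical names, not part of Verk; only `correlation_id` is carried over; and the map uses string keys because job args are assumed to go through JSON on their way to the worker.

```elixir
defmodule JobMetadata do
  # Hypothetical helper, NOT part of Verk.
  # Only these Logger metadata keys are copied into the job.
  @keys [:correlation_id]

  # Enqueue side: prepend a JSON-friendly metadata map to the job args.
  def wrap_args(args) when is_list(args) do
    metadata =
      Logger.metadata()
      |> Keyword.take(@keys)
      |> Map.new(fn {key, value} -> {Atom.to_string(key), value} end)

    [metadata | args]
  end

  # Worker side: copy the known keys back into the Logger metadata.
  def restore(metadata) when is_map(metadata) do
    Logger.metadata(
      for key <- @keys do
        {key, Map.get(metadata, Atom.to_string(key))}
      end
    )
  end
end

defmodule MetadataFirstWorker do
  # By convention the first argument is always the metadata map.
  def perform(metadata, arg1, arg2) do
    JobMetadata.restore(metadata)
    arg1 + arg2
  end
end

# Enqueueing would then look roughly like this (not executed here):
#
#   Verk.enqueue(%Verk.Job{
#     queue: :default,
#     class: "MetadataFirstWorker",
#     args: JobMetadata.wrap_args([1, 2])
#   })
```

Each worker still has to call `JobMetadata.restore/1` itself, but the shape of the metadata is defined in one place, so adding another key later only touches the helper.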

mskv commented 4 years ago

Thanks, will go that way probably, closing the issue.