snok / asgi-correlation-id

Request ID propagation for ASGI apps
MIT License
370 stars 29 forks source link

ARQ support #17

Closed sondrelg closed 1 year ago

sondrelg commented 2 years ago

Opening this to track progress on adding arq support.

For Sentry, we "extend" the range of a request ID by tagging events with a transaction_id. For Celery you can pass the request ID to the worker process in the job-event headers and load it into a context var on the worker with a pre-run signal.

For ARQ I think we will need:

  1. A mechanism for transferring the request ID from the webserver to the worker process.
  2. Some way of attaching the ID to a job in a way that makes sense

Point 1 might be possible already, or it might require an upstream PR in ARQ. There is an open issue here. I'd like to sit down an establish what's required soon, because I think the issue is a little unclear right now.

For point 2, I don't think the Celery implementation of having a context var will work for arq, since we're not running a sync process. Instead it would be great to figure out an appropriate, low impact way of associating a request ID to an arq job, which in turn could be associated with an asyncio task. Ideas on implementation are 100% welcome.

sondrelg commented 2 years ago

One way of implementing it might be to just maintain a context var of {task-id: request-id} and write to it every time a job is started. I suppose that would require pre- and post-run hooks to set and clean-up values.

For a little bit more context: this would enable us to set up a separate arq-logger, with an arq-specific filter that might look like this:

class ArqCorrelationId(Filter):
    def filter(self, record: LogRecord) -> bool:
        # Get job or task ID - whichever makes more sense
        task_id = asyncio...

        # Get CID wrt. task ID
        cid = arq_correlation_ids.get(task_id)

        # Attach to log record
        record.arq_correlation_id = cid
        return True
sondrelg commented 1 year ago

arq unfortunately (but understandably) seems unmaintained in the short term, so I've made the switch to saq which provides the event hooks needed. Documentation added in #47