snok / asgi-correlation-id

Request ID propagation for ASGI apps
MIT License
413 stars 32 forks source link

Add generated request ID to the request headers #64

Closed tlinhart closed 1 year ago

tlinhart commented 1 year ago

It might be a good idea to add the newly generated request ID also to the request headers. There may be cases where we can't or don't want to use the value from context variable and rely solely on the value in the request headers (e.g. other third party middlewares). This is especially true when there is a header in the incoming request but it doesn't pass the validator and thus a new value is generated. In that case, the header value is inconsistent with the context variable value and the value present in logs.

If this sounds reasonable, I could try to prepare a PR.

sondrelg commented 1 year ago

I think writing your own middleware that sets a request header is probably a better solution, unless you have a more general motivation for this feature that is clearly useful for other users. I'm not against it, but if you feel this is useful, could you elaborate on how?

tlinhart commented 1 year ago

I actually initially did just that, wrote my own middleware that adds the request ID to request headers. It is an ASGI version of the WSGI middleware I use on another project for years. Then I found this project which seems for cover all use cases one might possibly need, with this little exception.

As an additional motivation, consider various tracing middlewares which use request headers to set the span context. For example these two:

sondrelg commented 1 year ago

So you're saying, when an incoming request has no (for example) 'x-request-id' header present, and we generate one OR for the code path where it is present and we replace it, it would be useful to overwrite this request header with our set value?

I can see how it would be hard to automate tracing without this, so I'd consider myself convinced :slightly_smiling_face:

Before moving forward with this, do you see any drawbacks? It feels a bit weird to modify the request object when there wasn't a header there to begin with, but I can't think of a specific reason why that would be true.

JonasKs commented 1 year ago

Took me a few reads to really understand the use cases behind this, but I am convinced as well.
Altering incoming headers may not be what people want tho, so I suggest having it as an optional.

tlinhart commented 1 year ago

Yeah, that's what I'm trying to say :-) To reiterate, I find this especially useful in the case when you replace the header value with a new one should the original value not pass the validator. In that case, tracing middlewares would contain the original value in their context, while response headers (and presumably logs) would contain the new one. That would be misleading and would make it quite hard to debug issues in any observability platform correlating metrics, logs and traces (see e.g. https://grafana.com/blog/2020/03/31/how-to-successfully-correlate-metrics-logs-and-traces-in-grafana/).

Modifying request headers is IMHO quite common. I've been using it in a whole stack which consists of several components where each of them either reuses the request ID from the previous one or creates a new one and passes it in the request ID header to the next one. For example, if the top-level component is a HAProxy acting as a load-balancer, I'd use something along these lines:

...
unique-id-format %{+X}o\ %Ts%pid%fi%fp%ci%cp%rt
acl request_id_exists req.hdr(X-Request-ID) -m found
http-request set-header X-Request-ID %[unique-id,lower] unless request_id_exists
http-request capture hdr(X-Request-ID) len 64
http-response set-header X-Request-ID %[capture.req.hdr(0)]
...

Next, there might be an nginx server in front of the WSGI/ASGI server. I'd use:

map $http_x_request_id $req_id {
    ~* $http_x_request_id;
    default $request_id;
}

server {
    location / {
        ...
        proxy_hide_header X-Request-ID;
        add_header X-Request-ID $req_id always;
        ...
    }
}

So I don't really see any problems with this approach. The only thing I think should be taken care of is the order of middlewares so the request ID header is guaranteed to be present before any other middleware relying on it tries to access it.

sondrelg commented 1 year ago

Sounds good :+1:

sondrelg commented 1 year ago

Took me a few reads to really understand the use cases behind this, but I am convinced as well. Altering incoming headers may not be what people want tho, so I suggest having it as an optional.

Missed this, yeah I think an opt-in might be best for a first iteration, but would be happy to see your suggested implementation first @tlinhart, then we can discuss implementation details in the PR

tlinhart commented 1 year ago

Here's the initial stab on it, please take a look – https://github.com/snok/asgi-correlation-id/pull/65.