corazawaf / coraza

OWASP Coraza WAF is a golang modsecurity compatible web application firewall library
https://www.coraza.io
Apache License 2.0
2.13k stars 209 forks source link

Feature request: allow passing a context.Context as part of a transaction #919

Closed MagicalTux closed 4 months ago

MagicalTux commented 10 months ago

Summary

WithErrorCallback() callback should have access to a context related to the processed query in order to allow logging to be associated with the originating query.

This can be implemented without breaking existing code by adding an option to associate a context to a transaction (by default http.Request.Context() should be used) and add the ability to fetch said context in [types](https://pkg.go.dev/github.com/corazawaf/coraza/v3@v3.0.4/types).[MatchedRule](https://pkg.go.dev/github.com/corazawaf/coraza/v3@v3.0.4/types#MatchedRule).

Basic example

github.com/corazawaf/coraza/v3/http should set the context. It can be used like this:

func logWafError(error types.MatchedRule) {
        slog.Log(error.Context(), slogLevel(error.Rule().Severity()), error.ErrorLog(), ...)
}

Motivation

Go has implemented context.Context for some time and this is available as part of http.Request among other things. It can be used to pass various information and the new log/slog also has ability to accept contexts. It would be great for me to associate coraza logs with the original request, and the context would make that possible.

anuraaga commented 9 months ago

This seems reasonable to me, I would add NewTransactionContext to follow the standard naming convention, and http middleware could probably pass the request context by default

jcchavezs commented 9 months ago

I have a few thoughts on this:

I wonder @MagicalTux if you are using the http middleware provided by coraza. If not, you can call the tx.MatchedRules right before closing the transaction, if yes we have been talking with some folks here to add hooks on the middleware at the beginning and at the end of the request to be able to perform and action based on the transaction and I think you could accomplish this.

My main doubt around using a single context in a transaction is that

  1. It is unclear what happens on context cancellation or a context deadline.
  2. Transaction holding a context just for passing to the error callback feels weird since we don't use it anywhere else but agree accesing the context somehow is a legit case.
  3. I always thought that context could play a role on the request/response processing e.g. tx.ProcessRequestBody where it makes sense to pass a deadline, how would this play with a context per transaction?
anuraaga commented 9 months ago

IMO Go made a mistake conflating data and deadline in context, and we indeed have had issues with how to deal with the latter before because of it. Adding some state to the transaction for referencing in callbacks is reasonable though so in that case I think either

Is fine. I lean towards the former since I think it's very common including in slog but latter seems ok too

MagicalTux commented 9 months ago

@jcchavezs

I do use the provided http middleware (http.WrapHandler) as it was easier. I guess I could just copy the code and directly do all the steps to get things working but hooks would work too I guess.

Still having the ability to pass context would be useful since contexts aren't only about deadlines or cancellation, but also passing various kind of information. Cancel/deadlines can be ignored for the context of things such as logging (log/slog) and such.

Another advantage is that http.Request which the middleware takes in already has a context, which means the existing implementation can automatically fetch the context from there. Actually if just the request was exposed in the log side it could be used to fetch the context (solve my issue) and can also provide a lot of useful information for logging.

Right now the only things the logging callback has access to is types.MatchedRule which contains some information about the request (URI(), etc). Anything would be helpful, a custom map, a context.Context or the http.Request object would do.

MrWako commented 7 months ago

Hi Coraza team - just an upvote on this proposal. Previously I experimented with earlier versions of the WAF and didn't get on to well but v3 seems a big step forward. Nice work!

We're using the coraza provided middleware within our own existing proxy Go reverse proxy - there's a fair amount of logic and complexity in the WrapHandler method and I didn't want to have to copy and maintain this separately. The middleware worked pretty seamlessly but the major usability/integration issue we have is not having access to the context in the error call back function. We decorate the incoming request context with correlation IDs and other information then output structured/JSON logs. Log analysis is largely done by filtering on these structured fields. If we can get the context into the error call back function we could properly integrate the Coraza output into our logs. I imagine this is a pretty typical pattern.

The solution that would work for me is to have the request context as a field on types.MatchedRule (non-breaking change). I really don't want to have to copy and paste the WrapHandler code ... although I see this look like what had to be done for the Caddy plugin: https://github.com/corazawaf/coraza-caddy/blob/main/http.go

jcchavezs commented 7 months ago

Started this PR https://github.com/corazawaf/coraza/pull/962 cc @MrWako @MagicalTux

jptosso commented 7 months ago

Can we close this? as this is already implemented as experimental

fzipi commented 4 months ago

Closing.