corazawaf / coraza-proxy-wasm

proxy-wasm filter based on Coraza WAF
Apache License 2.0
115 stars 24 forks source link

Connect the dots with observability #166

Open jcchavezs opened 1 year ago

jcchavezs commented 1 year ago

Right now there is no trivial way of connecting audit logs or debug logs (properly coraza logs) with the underlying requests or their consequent proxy logs (e.g. envoy logs). transaction ID is one identifier associated with the WAF transaction (aka the request in the server) and is local to the server request processing.

There are a couple of options here we could explore:

  1. use a request-id as transaction ID: This is one way but isn't ideal request-id is the same across the hops in the distributed request, meaning that transaction-id will be the same for all component in the request that are behind a WAF.
  2. use span ID as transaction ID: while this is more accurate than the previous one, it depends on (distributed tracing) propagation format and if using single header, coraza needs to extract the span ID from the header.
  3. Allow the auditlogs to include extra information based on variables or directly REQUEST_HEADERS variable (e.g. REQUEST_HEADERS:X-Request-ID: This is probably the easiest approach and does not need to happen in seclang necessarily but in the config of the WAF. A new auditlogpart X would be needed to include all these extra fields. Whenever you want to correlated a request with a transaction, look for the request ID in the audit logs. Note: currently audit logs support printing the request headers but doing that for the sake of a single header is not only overkill but also a security concern as there is no redaction of potential sensitive information or PII.
  4. Allow proxy-wasm to pass a response header to envoy and envoy log that in envoy logs: This would require a massive effort, starting for changing the ABI to support this data exchange.

I would love to hear some input from @basvanbeek and @wu-sheng on this.

wu-sheng commented 1 year ago

I think the key is what is granularity of the dot. If the duration of each dot would generate a span? What is the expected duration of the span? 10+ nanosec or more?

anuraaga commented 1 year ago

This seems like an issue for Envoy, any filter may log so it makes sense to have a consistent story for correlation, presumably by having Envoy provide span/trace ID, maybe even within the wasm ABI handler without needing anything special inside filters.

jcchavezs commented 1 year ago

So I think we are narrowing down the scope of this discussion to the simplest approach which can be done in a timely manner. Not so long ago I started an issue (see https://github.com/envoyproxy/envoy/issues/21959) to discuss a way to enrich observability data in envoy proxy but my feeling is that it will be a looong run until that can make to any master (either envoy or proxy-wasm, proxy-wasm being the main bottleneck).

First of all, I avoided talking about spans for filters because of different reasons: first one being the lack of observability in envoy filters itself (no duration, no start, no end, no notice on who did the blocking, nor the reason) and scoped the conversation around how can Coraza enable this case in the near future, more so we should not only limit this to envoy but other proxies to so make sense provide a way to do so here aswell. Do you think that is reasonable?

wu-sheng commented 1 year ago

I agree increasing observability makes perfect sense. Are those filters running parallel in general case and envoy case? If so, we are better to bind those with spans.

anuraaga commented 1 year ago

Do you think that is reasonable?

Coraza is a WAF - I think we already have that functionality available. Adding specific functionality for observability to coraza wasm seems like outside its scope, it just feels like anything would be too ad hoc.

fzipi commented 1 year ago

I tend to agree here. Why add something that the web server (or envoy) should already be doing?

jcchavezs commented 1 year ago

So the main point here is "will proxies do this?" I think the answer is, yeah they should expose and API to do so but at the moment they won't and I don't see a clear path to implement this (proxy wasm is not accepting new features for now until someone takes ownership of it) so that is why I think in the meantime we ca provide this to users as a UX improvement.

On Thu, 9 Mar 2023, 15:48 Felipe Zipitría, @.***> wrote:

I tend to agree here. Why add something that the web server (or envoy) should already be doing?

— Reply to this email directly, view it on GitHub https://github.com/corazawaf/coraza-proxy-wasm/issues/166#issuecomment-1462187210, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAXOYAWN2WBHITMBYL3XEETW3HUSVANCNFSM6AAAAAAVU7ECIQ . You are receiving this because you authored the thread.Message ID: @.***>

anuraaga commented 1 year ago

It seems like option 3), allowing selecting keys from variables for audit logging, is probably the option that makes sense for a lightweight integration, really Coraza shouldn't be parsing any headers for it. Can move to Coraza repo maybe

jcchavezs commented 1 year ago

This same concern is valid for things like http middleware in Coraza. I will open an issue in coraza itself and we can evolve the design there.

jcchavezs commented 1 year ago

Related: https://github.com/proxy-wasm/spec/pull/5

jcchavezs commented 1 year ago

Related https://github.com/corazawaf/coraza-proxy-wasm/issues/209#issuecomment-1605994021

cc @ericinfra

mateuslima90 commented 12 months ago

I wanted to do correlation between the header x-request-id and the rules matched in the request. So I did a test and I updated a rule with id 941100. SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES|REQUEST_HEADERS:User-Agent|ARGS_NAMES|ARGS|XML:/* "@detectXSS" \ "id:941100,\ phase:2,\ block,\ t:none,t:utf8toUnicode,t:urlDecodeUni,t:htmlEntityDecode,t:jsDecode,t:cssDecode,t:removeNulls,\ msg:'XSS Attack Detected via libinjection',\ logdata:'Matched Data: XSS data found within %{MATCHED_VAR_NAME}: %{MATCHED_VAR} - x-request-id2: %{request_headers.x-request-id}',\ tag:'application-multi',\ tag:'language-multi',\ tag:'platform-multi',\ tag:'attack-xss',\ tag:'paranoia-level/1',\ tag:'OWASP_CRS',\ tag:'capec/1000/152/242',\ tag:'x-request-id3: %{request_headers.x-request-id}',\ ver:'OWASP_CRS/4.0.0-rc1',\ severity:'CRITICAL',\ setvar:'tx.xss_score=+%{tx.critical_anomaly_score}',\ setvar:'tx.inbound_anomaly_score_pl1=+%{tx.critical_anomaly_score}'"

In the log show in this form: [2023-11-28 02:58:46.559][30][critical][wasm] [source/extensions/common/wasm/context.cc:1180] wasm log coraza-filter coraza-filter coraza-filter_vm_id: [client \"x.x.x.x\"] Coraza: Warning. XSS Attack Detected via libinjection [file \"@owasp_crs/REQUEST-941-APPLICATION-ATTACK-XSS.conf\"] [line \"7391\"] [id \"941100\"] [rev \"\"] [msg \"XSS Attack Detected via libinjection\"] [data \"Matched Data: XSS data found within ARGS_GET:arg\\\\: <script>alert(0)</script> - x-request-id2: 93ded0b5-cfa0-9ca6-a723-89f6a88e4fa6\"] [severity \"critical\"] [ver \"OWASP_CRS/4.0.0-rc1\"] [maturity \"0\"] [accuracy \"0\"] [tag \"application-multi\"] [tag \"language-multi\"] [tag \"platform-multi\"] [tag \"attack-xss\"] [tag \"paranoia-level/1\"] [tag \"OWASP_CRS\"] [tag \"capec/1000/152/242\"] **[tag \"x-request-id3: %{response_headers.x-request-id}\"]** [hostname \"x.x.x.x\"] [uri \"/x/y/z/a/testing?arg\\\\=\\\\<script\\\\>alert\\\\(0\\\\)\\\\</script\\\\>\"] [unique_id\"LBXNdqvkszpuhpJEuHD\"]

Are there other configuration to do this?

ryanobjc commented 10 months ago

X-Request-ID would work well in a istio deployment scenario!