owasp-modsecurity / ModSecurity

ModSecurity is an open source, cross platform web application firewall (WAF) engine for Apache, IIS and Nginx. It has a robust event-based programming language which provides protection from a range of attacks against web applications and allows for HTTP traffic monitoring, logging and real-time analysis.
https://www.modsecurity.org
Apache License 2.0
7.67k stars 1.54k forks source link

Align "reference" field value in JSON transaction log with standard log. #3093

Open asamuta opened 2 months ago

asamuta commented 2 months ago

Transaction JSON log of "reference" is too verbose for some rules. There is a limit for "ref" field in "standard" log.

file: rule_message.cc

...
msg.append(" [ref \"" + utils::string::limitTo(200, rm->m_reference) + "\"]");
...

but there is no such limit in file: transaction.cc

...
LOGFY_ADD("reference", a.m_reference.c_str());
...

With this fix JSON out will be truncated as standard log.

Before:

...
"messages": [
      {
        ....
        "details": {
          "match": "Matched \"Operator `Gt' with parameter `10000' against variable `ARGS_COMBINED_SIZE' (Value: `971714.000000' )",
          "reference": "v0,31v32,32v0,31v32,74v0,32v33,166v0,33v34,5v0,31v32,32v0,30v31,165v0,33v34,193v0,33v34,227v0,33v34,41v0,33v34,47v0,33v34,222v0,35v36,9v0,28v29,10v0,25v26,237v0,35v36,62v0,31v32,20v0,31v32,20v0,31v32,20v0,31v32,19v0,31v32,19v0,31v32,20v0,31v32,20v0,31v32,20v0,31v32,19v0,32v33,19v0,25v26,26v0,25v26,39v0,25v26,321v0,25v26,216v0,25v26,147v0,25v26,229v0,25v26,295v0,25v26,147v0,25v26,219v0,26v27,412v0,26v27,357v0,26v27,254v0,26v27,232v0,26v27,309v0,26v27,76v0,26v27,116v0,26v27,32v0,26v27,346v0,26v27,211v0,26v27,467v0,26v27,363v0,26v27,79v0,26v27,190v0,26v27,205v0,26v27,217v0,33v34,17v0,33v34,17v0,33v34,16v0,33v34,17v0,33v34,17v0,33v34,17v0,33v34,16v0,33v34,17v0,33v34,17v0,34v35,17v0,..... 75 kilobytes of text here,
          "ruleId": "1100",
          "file": "/etc/nginx/modsec/owasp-modsecurity-crs/rules/REQUEST-900-EXCLUSION-RULES-BEFORE-CRS.conf",
          "lineNumber": "209",
         ...
        }
      }
...

after the fix:

...
"reference":"v0,31v32,32v0,31v32,74v0,32v33,166v0,33v34,5v0,31v32,32v0,30v31,165v0,33v34,193v0,33v34,227v0,33v34,41v0,33v34,47v0,33v34,222v0,35v36,9v0,28v29,10v0,25v26,237v0,35v36,62v0,31v32,20v0,31v32,20v0,31v32, (79218 characters omitted)",
...

As long as there is no way to disable the "reference" field in JSON transaction log and the information it contains doesn't look useful in production logs please approve this change. I think the JSON log approach may be reworked later but meanwhile, we need a way to truncate it.

sonarcloud[bot] commented 2 months ago

Quality Gate Passed Quality Gate passed

Issues
0 New issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

airween commented 2 months ago

Hi @asamuta,

first of all: thank your for your contribution!

Could you share with us some example, how can we reproduce this message in audit log? (Eg. share some details (eg. used rule set, special settings) and a curl example).

asamuta commented 2 months ago

Hi @airween,

NGINX MOD_SECURITY v3.0.12 OWASP CRS V4.0 https://github.com/coreruleset/coreruleset/releases/tag/v4.0.0

Additional rule that must be placed in REQUEST-900-EXCLUSION-RULES-BEFORE-CRS.conf

SecRule ARGS_COMBINED_SIZE "@gt 1000"\
     "id:400001,\
     phase:2,\
     pass,\
     nolog,\
     auditlog,\
     msg:'REQ_SIZE',\
     logdata:'%{ARGS_COMBINED_SIZE}',\
     ctl:ruleRemoveTargetByTag=attack-xss;ARGS"

This is a simplified rule to demonstrate the issue.

Payload - a big json (the example is attached). example.json

CURL: curl --location 'localhost:8080/test' --header 'Content-Type: application/json' --data-binary "@./example.json"

NOTE: Nginx in my case configured as a reverse proxy with upstream that targets a mock server that responds with simple valid JSON to all requests.

asamuta commented 2 months ago

Hi @MirkoDziadzka , @airween , I have tried to find any documentation about what the reference field value is. Unfortunately without any luck. Could you please guide me to any documentation about what this reference value means?

"reference": "v0,31v32,32v0,31v32,74v0,32v33,166v0 ....

How to interpret these numbers? How do these numbers help a SOC team? Is it worth storing tens of kilobytes?

Ideally of course it should be possible to switch it on/off using "audit log parts" letters. But it will be a much more complicated task... I think the build time option is also a good point.

Thanks for your efforts on this. Waiting for your decision.

zimmerle commented 2 months ago

Hi @asamuta, I developed this feature to pinpoint the exact location (within the request body or any other data) where the operator returned true. It was intended to highlight the relevant fragments related to the match, making visualization easier in the web console for specific use cases. The issue you've encountered seems to be a bug, although I haven't delved into the details. If it's not necessary, you can easily disable this feature with a build-time flag, or even at runtime if needed. Over the years, I've customized this for various consulting clients. If it's not being utilized, it might indeed be superfluous to retain this data. I understand your concern; even a few bytes can add up to significant costs over a billion requests.

As per v2, I don't think this feature is available there.

MirkoDziadzka commented 2 months ago

I agree that this feature might not be used. Unfortunately, (if I remember correctly), many regression tests are using the debug output of the variable origin.

My concern with this PR: Assuming somebody is using this and parsing the line and it fails (e.g. simple python scripts would raise an exception) and some workflow is breaking.

I'm not sure if this is a concern for the modec team, but long term ago I worked as a system admin and I would not like such a change. But not my decision. It might ok to just remove it.

@asamuta regarding the format:

If we care to not break external parsers, we should probably split before "v"

asamuta commented 2 months ago

Hi All, As far as I understand the easiest option here is to create a build-time flag and cut the value before 'v' to keep parsers working. Please confirm that such a change will be accepted.

If so, please point me to such "build-time flag" so that I can implement this based on that example. Also, it would be great if you could advise me the name of the new flag according to the naming convention used in the project. Thanks in advance.

airween commented 2 months ago

Please confirm that such a change will be accepted.

I can agree an optional, not mandatory modification.

If so, please point me to such "build-time flag" so that I can implement this based on that example. Also, it would be great if you could advise me the name of the new flag according to the naming convention used in the project. Thanks in advance.

As I know libmodsecurity3 does not have any kind of configure option which can be used to demonstrate the example. But mod_security2 has. You should check out the branch v2/master (or it would be better if you make a new clone and check out there), run the ./autogen.sh command, and then the ./configure --help. There you can see some similar solution, eg. --enable-pcre-match-limit or --enable-pcre-match-limit-recursion. For the syntax, check the configure.ac. Please find the references in source code for symbol MODSEC_PCRE_MATCH_LIMIT or MODSEC_PCRE_MATCH_LIMIT_RECURSION.

If you don't want to check the source tree and the build system's behavior, just see the configure.ac file relevant parts.

For the name, I can imagine the --enable-truncate-reference-in-auditlog or something similar, with a numeric argument, which indicates how much is the maximum length of that field, eg --enable-truncate-reference-in-auditlog=200 which means the maximum length can be 200 characters (or the last occurrence of v{offset},{length} pattern). Please make an accurate note for that option.

The option name is not the best choice, so please help me to figure out the correct one.

airween commented 1 month ago

To everyone: what do you think about to make this feature optional? Do you think we can use the configure flag --enable-truncate-reference-in-auditlog or --enable-truncate-reference-in-auditlog=NNN?

@asamuta could you prepare this patch with this suggestion? May be you should replace the name of the flag, but that's it, I guess.