Open zoulja opened 5 years ago
Bumping this. I also had to resort to some trickery in Graylog 5.1, after validating the pattern on regex101.com in the same way.
rule "sanitize passwords" when
contains(value: to_string($message.message), search: "ass", ignore_case: true) == true
then
// Had to base64-encode it, as it was breaking parsing in some fashion
// original regex: [pP][Aa][Ss][Ss](?:[Ww][Oo][Rr][Dd])?[$=:\"]?[\":]?\s*[\"]?(\w+)[\"]?
let pattern_base64 = "W3BQXVtBYV1bU3NdW1NzXSg/OltXd11bT29dW1JyXVtEZF0pP1skPTpcIl0/W1wiOl0/XHMqW1wiXT8oXHcrKVtcIl0/Cg==";
let pattern = base64_decode(pattern_base64);
set_field("message",
regex_replace(
pattern: pattern,
value: to_string($message.message),
replacement: "[REDACTED]",
replace_all: true)
);
end
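For anyone reproducing this workaround, the base64 string can be generated outside Graylog. Below is a minimal Python sketch, not part of the original rule; the helper name `encode_pattern` is made up for illustration. One thing it shows: the blob posted above ends in `Cg==`, which decodes to a trailing newline (possibly from encoding with `echo` without `-n`), so the sketch strips trailing newlines before encoding. The final check uses Python's `re`, not Graylog's Java regex engine, so treat it only as a rough sanity test.

```python
import base64
import re

# The regex quoted in the rule comment above, as validated on regex101.com.
PATTERN = r'[pP][Aa][Ss][Ss](?:[Ww][Oo][Rr][Dd])?[$=:\"]?[\":]?\s*[\"]?(\w+)[\"]?'

# The base64 string posted in the rule above.
POSTED_BLOB = (
    "W3BQXVtBYV1bU3NdW1NzXSg/OltXd11bT29dW1JyXVtEZF0pP1skPTpcIl0/"
    "W1wiOl0/XHMqW1wiXT8oXHcrKVtcIl0/Cg=="
)


def encode_pattern(pattern: str) -> str:
    """Hypothetical helper: base64-encode a pattern without a trailing newline."""
    return base64.b64encode(pattern.rstrip("\n").encode("utf-8")).decode("ascii")


# Decoding the posted blob gives the original regex plus one trailing newline.
decoded = base64.b64decode(POSTED_BLOB).decode("utf-8")
print(decoded == PATTERN + "\n")  # True

# Encoding without the newline gives the posted string minus the final "Cg==".
clean_blob = encode_pattern(PATTERN)
print(clean_blob == POSTED_BLOB[:-4])  # True

# Rough local check of the regex itself.
redacted = re.sub(PATTERN, "[REDACTED]", 'login password="hunter2" ok')
print(redacted)  # login [REDACTED] ok
```

Whether the trailing newline matters inside Graylog is not something this sketch can tell; it only shows that the posted blob contains one.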
It's not just regex: the same thing appears to happen with complex Grok patterns, which forced yet another base64 workaround. Because the UI fails to parse the rule, a valid, working pattern (Grok or regex) cannot be saved.
rule "extract nginx ingress controller log"
when
has_field("application_name") && to_string($message.application_name) == "controller.ingress-nginx"
then
// https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/log-format/
// https://github.com/ChrsMark/beats/blob/194bb7be9271814e51883c25453277fd72f6f767/filebeat/module/nginx/ingress_controller/ingest/pipeline.yml
let pat = "JXtJUDpyZW1vdGVfYWRkfSAtICV7VVNFUjpyZW1vdGVfdXNlcn0gXFsle0hUVFBEQVRFOnRpbWVfbG9jYWx9XF0gIiV7SFRUUF9NRVRIT0Q6cmVxdWVzdF9tZXRob2R9ICV7VVJJUEFUSFBBUkFNOnJlcXVlc3RfcGF0aH0gJXtIVFRQX1ZFUlNJT046aHR0cF92ZXJzaW9ufSIgJXtJTlQ6c3RhdHVzfSAle0lOVDpib2R5X2J5dGVzX3NlbnR9ICIle0dSRUVEWURBVEE6aHR0cF9yZWZlcnJlcn0iICIle0dSRUVEWURBVEE6aHR0cF91c2VyX2FnZW50fSIgJXtJTlQ6cmVxdWVzdF9sZW5ndGh9ICV7REVDSU1BTDpyZXF1ZXN0X3RpbWV9IFxbJXtIT1NUTkFNRTpwcm94eV91cHN0cmVhbV9uYW1lfVxdIFxbJXtHUkVFRFlEQVRBOlVOV0FOVEVEfVxdICV7SE9TVFBPUlQ6dXBzdHJlYW1fYWRkcn0gJXtJTlQ6dXBzdHJlYW1fcmVzcG9uc2VfbGVuZ3RofSAle0RFQ0lNQUw6dXBzdHJlYW1fcmVzcG9uc2VfdGltZX0gJXtJTlQ6dXBzdHJlYW1fc3RhdHVzfSAle1dPUkQ6cmVxdWVzdF9pZH0=";
let match_on = base64_decode(pat);
let results = grok(match_on, to_string($message.message), true);
set_fields(results);
end
Working GROK pattern
%{IP:remote_add} - %{USER:remote_user} \[%{HTTPDATE:time_local}\] "%{HTTP_METHOD:request_method} %{URIPATHPARAM:request_path} %{HTTP_VERSION:http_version}" %{INT:status} %{INT:body_bytes_sent} "%{GREEDYDATA:http_referrer}" "%{GREEDYDATA:http_user_agent}" %{INT:request_length} %{DECIMAL:request_time} \[%{HOSTNAME:proxy_upstream_name}\] \[%{GREEDYDATA:UNWANTED}\] %{HOSTPORT:upstream_addr} %{INT:upstream_response_length} %{DECIMAL:upstream_response_time} %{INT:upstream_status} %{WORD:request_id}
Currently, Graylog pipelines (and maybe some other parts) require complicated manual escaping in regexes. Even when my pattern works perfectly after 10 minutes of debugging with https://regex101.com/, I then have to spend 20 more minutes guessing what exactly Graylog doesn't like in the pipeline configuration window, in something like regex_replace.
Suggestion: make the regex engine frontend more user-friendly. Ideally, it should accept already verified patterns as is, without any manual escaping.
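To illustrate the manual escaping described above: as far as I understand, string literals in pipeline rules treat the backslash as an escape character, so every backslash and double quote in a verified pattern has to be escaped by hand before pasting. Here is a rough Python sketch of that step. The helper name `escape_for_rule_string` is hypothetical, and the escaping rules are an assumption that may not cover every case the rule editor rejects.

```python
def escape_for_rule_string(pattern: str) -> str:
    """Hypothetical helper: escape a regex for a double-quoted rule string.

    Assumes the rule language uses backslash escapes inside double quotes.
    """
    # Escape backslashes first so the ones added for quotes are not doubled.
    return pattern.replace("\\", "\\\\").replace('"', '\\"')


# A small pattern as it would be verified on regex101.com.
verified = r'(\w+)="(\d+)"'
print(escape_for_rule_string(verified))  # (\\w+)=\"(\\d+)\"
```

This is exactly the kind of transformation the editor could apply itself, or avoid needing altogether.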