Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org

Pipeline rules get executed even when the conditions are joined with "and" and should not match #12393

Open paladox opened 2 years ago

paladox commented 2 years ago

Expected Behavior

I expect the rule to be triggered only if all of the conditions are true (not just one or another).

Current Behavior

The rule gets triggered for both the nginx and the mediawiki messages. It should only be triggered for the mediawiki one, but it is also being executed on messages it shouldn't match (for example nginx).

Steps to Reproduce (for bugs)

  1. Create a rule with:
rule "mediawiki parse json"
when
  has_field("message") and
  starts_with(to_string($message.message), "{") and
  ends_with(to_string($message.message), "}") and
  contains(to_string($message.message), "application_name", true) and
  contains(to_string($message.message), "rsyslog", true)
then
  let json = parse_json(to_string($message.message));
  let map = to_map(json);
  //debug(map);
  //set_field("message", map);
  set_fields(map);

  set_field("application_name", to_string("mediawiki"));
  //set_field("source", to_string($message.host));

  let prop = select_jsonpath(json, {timestamp: "$.@timestamp"});
  let new_date = flex_parse_date(to_string(prop.timestamp));
  set_field("timestamp", new_date);
end
  2. Send messages with the contents:
{"timestamp":"2022-04-05T14:33:11+00:00", "message":" { \"timestamp\": \"1649169191.528\", \"remote_addr\": \"xxxx\", \"remote_user\": \"\", \"time_local\": \"05\/Apr\/2022:14:33:11 +0000\", \"request_method\": \"GET\", \"request_uri\": \"xxxx\", \"status\": \"304\", \"body_bytes_sent\": \"0\",\"http_x_forwarded_for\": \"xxxx, 127.0.0.1\",\"http_referrer\": \"\", \"http_user_agent\": \"xxx\", \"request_time\": \"0.297\", \"ssl_protocol\": \"TLSv1.3\", \"ssl_cipher\": \"TLS_AES_256_GCM_SHA384\", \"nginx_access_log\": true }", "host":"test101", "logsource":"test101", "severity":"info", "facility":"local7", "application_name":"nginx", "rsyslog": "true"}

and

{"timestamp":"2022-04-05T14:33:12.264358+00:00", "message":"33:12 mediawiki: {\"@timestamp\":\"2022-04-05T14:33:12.240509+00:00\",\"@version\":1,\"host\":\"test101\",\"message\":\"\",\"type\":\"mediawiki\",\"channel\":\"api-request\",\"level\":\"INFO\",\"monolog_level\":200,\"shard\":\"c4\",\"url\":\"\/w\/api.php\",\"ip\":\"xxxx\",\"http_method\":\"POST\",\"server\":\"xxx\",\"referrer\":\"xxx\",\"wiki\":\"betawiki\",\"mwversion\":\"1.37.2\",\"reqId\":\"xxxx\",\"$schema\":\"\/mediawiki\/api\/request\/1.0.0\",\"meta\":{\"request_id\":\"c098eb2b22dc010cbbd4b071\",\"id\":\"2823cc93-e94c-47a6-8558-55559625110e\",\"dt\":\"2022-04-05T14:33:12Z\",\"domain\":\"beta.betaheze.org\",\"stream\":\"mediawiki.api-request\"},\"http\":{\"method\":\"POST\",\"client_ip\":\"xxx\",\"request_headers\":{\"accept-language\":\"en-GB,en-US;q=0.9,en;q=0.8\",\"referer\":\"xxx\",\"user-agent\":\"xxx\"}},\"performer\":{\"user_text\":\"xxx\"},\"database\":\"betawiki\",\"backend_time_ms\":45,\"params\":{\"action\":\"query\",\"format\":\"json\",\"pageids\":\"1\",\"prop\":\"pagerating\",\"prcontest\":\"\"},\"private\":true}", "host":"localhost", "logsource":"localhost", "severity":"info", "facility":"user", "application_name":"2022-04-05T14", "rsyslog": "true"}

(I have an extractor rule that uses a regex so that it gets to the JSON correctly.)
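
One way to narrow this down is to check offline which of the five `when` conditions hold for a given value of the message field. The following is a minimal Python sketch, not Graylog code: the helper names `evaluate_conditions` and `rule_matches` are hypothetical, Python string methods only approximate Graylog's `starts_with`, `ends_with` and `contains` (with the third argument assumed to mean ignore case), and the sample values are made up because the issue does not show the field as Graylog sees it after the extractor runs.

```python
from typing import Dict, Optional


def evaluate_conditions(message: Optional[str]) -> Dict[str, bool]:
    """Mirror the five `when` conditions of the rule for one message field value."""
    text = "" if message is None else message
    lowered = text.lower()  # models contains(..., true), assumed to ignore case
    return {
        "has_field": message is not None,
        "starts_with_brace": text.startswith("{"),
        "ends_with_brace": text.endswith("}"),
        "contains_application_name": "application_name" in lowered,
        "contains_rsyslog": "rsyslog" in lowered,
    }


def rule_matches(message: Optional[str]) -> bool:
    """The rule joins its conditions with `and`, so every one of them must hold."""
    return all(evaluate_conditions(message).values())


# Hypothetical field values, not taken from a live Graylog instance.
print(evaluate_conditions(' { "nginx_access_log": true }'))
print(rule_matches('{"application_name": "x", "rsyslog": "true"}'))
```

Passing in the exact content of the message field for the nginx and the mediawiki messages shows which condition, if any, tells the two apart.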

Context

I'm migrating to rsyslog, and it seems the messages are formatted differently compared to syslog-ng.

rsyslog seems to encode the JSON message so that it contains \/, which breaks our extractor. I figured out that I can have Graylog unescape it by parsing the JSON and writing it out again. There still seem to be some issues, though, as you can see with the MediaWiki message: part of the timestamp ends up as the application name. That is why I want the rule to set the application name itself and not apply to the nginx message.
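
The "parse it and write it out again" idea can be illustrated outside Graylog. This is a small Python sketch under the assumption that the payload is valid JSON. It uses Python's standard `json` module rather than Graylog's `parse_json`, and the function name `unescape_json_text` is hypothetical. In JSON, `\/` is just an optional escape for `/`, so a parse followed by a re-serialize yields plain slashes.

```python
import json


def unescape_json_text(raw: str) -> str:
    """Parse JSON text and serialize it again; escaped slashes come back as '/'."""
    return json.dumps(json.loads(raw), ensure_ascii=False)


# A shortened, rsyslog-style fragment with escaped slashes (illustrative only).
escaped = r'{"url":"\/w\/api.php"}'
print(unescape_json_text(escaped))
```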

Your Environment

flightlesstux commented 2 years ago

Hi,

I found a workaround for this timestamp mismatch problem. I capture the timestamp that comes with the log into a temp_timestamp field using a Grok pattern.

My Grok pattern starts like this: \[%{TIMESTAMP_ISO8601:temp_timestamp}\]

Then I added a pipeline rule for my logs with these lines:

  set_field("timestamp", to_date(concat(to_string(newFields.temp_timestamp), ".000")));
  remove_field("temp_timestamp");
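
For readers who want to try the idea without a Graylog instance, here is a rough Python sketch of the same steps: capture the bracketed timestamp, append ".000", and parse the result as a date. It is an approximation, not the rule itself. The regex is a simplified stand-in for the Grok TIMESTAMP_ISO8601 pattern and assumes a timestamp without fractional seconds or time zone; the function name `timestamp_from_log` and the sample line are hypothetical.

```python
import re
from datetime import datetime
from typing import Optional

# Simplified stand-in for \[%{TIMESTAMP_ISO8601:temp_timestamp}\]
BRACKETED_TS = re.compile(r"\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\]")


def timestamp_from_log(line: str) -> Optional[datetime]:
    """Extract the bracketed timestamp, append '.000', and parse it."""
    match = BRACKETED_TS.search(line)
    if match is None:
        return None
    temp_timestamp = match.group(1)
    # Same idea as concat(..., ".000") followed by to_date(...) in the rule.
    return datetime.strptime(temp_timestamp + ".000", "%Y-%m-%dT%H:%M:%S.%f")


# Hypothetical log line for illustration.
print(timestamp_from_log("[2022-04-05T14:33:11] GET /w/api.php"))
```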

P.S. @paladox You shouldn't use an Elasticsearch version higher than 7.11 because it will break your instance. Reference: https://docs.graylog.org/docs/installing

jens-graylog commented 2 years ago

Checked and confirmed that the issue exists as described in the steps to reproduce. The workaround also produces the desired results.