Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org
Other
7.37k stars 1.06k forks source link

pipeline: json message crash processing - loosing message in processing. #8617

Closed jalogisch closed 3 years ago

jalogisch commented 4 years ago

Current Behavior

Having the rule & pipeline running in the simulator does work without any problems, only if you run that processing the message is dropped. Using the reproduce steps will create the following error:

2020-07-24T12:38:55.750+02:00 WARN  [JsonParse] Unable to parse JSON
com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'check': was expecting ('true', 'false' or 'null')
 at [Source: (String)"[check-result][2] marking unavailable shards as stale: [3SVpryx_TPqp_qZI3_wl6A]"; line: 1, column: 7]
    at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1804) ~[graylog.jar:?]
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:703) ~[graylog.jar:?]
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._reportInvalidToken(ReaderBasedJsonParser.java:2853) ~[graylog.jar:?]
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._handleOddValue(ReaderBasedJsonParser.java:1899) ~[graylog.jar:?]
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:757) ~[graylog.jar:?]
    at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeArray(JsonNodeDeserializer.java:431) ~[graylog.jar:?]
    at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:70) ~[graylog.jar:?]
    at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:15) ~[graylog.jar:?]
    at com.fasterxml.jackson.databind.ObjectMapper._readTreeAndClose(ObjectMapper.java:4057) ~[graylog.jar:?]
    at com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:2552) ~[graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.functions.json.JsonParse.evaluate(JsonParse.java:53) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.functions.json.JsonParse.evaluate(JsonParse.java:36) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.ast.expressions.FunctionExpression.evaluateUnsafe(FunctionExpression.java:63) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.ast.functions.ParameterDescriptor.required(ParameterDescriptor.java:116) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.functions.json.IsJson.evaluate(IsJson.java:39) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.functions.json.IsJson.evaluate(IsJson.java:28) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.ast.expressions.FunctionExpression.evaluateUnsafe(FunctionExpression.java:63) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.ast.expressions.EqualityExpression.evaluateBool(EqualityExpression.java:47) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.evaluateRuleCondition(PipelineInterpreter.java:407) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.evaluateStage(PipelineInterpreter.java:303) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.processForResolvedPipelines(PipelineInterpreter.java:267) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.process(PipelineInterpreter.java:147) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.process(PipelineInterpreter.java:103) [graylog.jar:?]
    at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.handleMessage(ProcessBufferProcessor.java:136) [graylog.jar:?]
    at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.dispatchMessage(ProcessBufferProcessor.java:121) [graylog.jar:?]
    at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.onEvent(ProcessBufferProcessor.java:93) [graylog.jar:?]
    at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.onEvent(ProcessBufferProcessor.java:47) [graylog.jar:?]
    at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:143) [graylog.jar:?]
    at com.codahale.metrics.InstrumentedThreadFactory$InstrumentedRunnable.run(InstrumentedThreadFactory.java:66) [graylog.jar:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
2020-07-24T12:38:55.751+02:00 WARN  [JsonParse] Unable to parse JSON
com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'check': was expecting ('true', 'false' or 'null')
 at [Source: (String)"[check-result][2] marking unavailable shards as stale: [3SVpryx_TPqp_qZI3_wl6A]"; line: 1, column: 7]
    at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1804) ~[graylog.jar:?]
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:703) ~[graylog.jar:?]
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._reportInvalidToken(ReaderBasedJsonParser.java:2853) ~[graylog.jar:?]
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._handleOddValue(ReaderBasedJsonParser.java:1899) ~[graylog.jar:?]
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:757) ~[graylog.jar:?]
    at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeArray(JsonNodeDeserializer.java:431) ~[graylog.jar:?]
    at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:70) ~[graylog.jar:?]
    at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:15) ~[graylog.jar:?]
    at com.fasterxml.jackson.databind.ObjectMapper._readTreeAndClose(ObjectMapper.java:4057) ~[graylog.jar:?]
    at com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:2552) ~[graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.functions.json.JsonParse.evaluate(JsonParse.java:53) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.functions.json.JsonParse.evaluate(JsonParse.java:36) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.ast.expressions.FunctionExpression.evaluateUnsafe(FunctionExpression.java:63) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.ast.expressions.Expression.evaluate(Expression.java:41) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.ast.statements.VarAssignStatement.evaluate(VarAssignStatement.java:33) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.ast.statements.VarAssignStatement.evaluate(VarAssignStatement.java:22) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.evaluateStatement(PipelineInterpreter.java:385) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.executeRuleActions(PipelineInterpreter.java:369) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.evaluateStage(PipelineInterpreter.java:309) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.processForResolvedPipelines(PipelineInterpreter.java:267) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.process(PipelineInterpreter.java:147) [graylog.jar:?]
    at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.process(PipelineInterpreter.java:103) [graylog.jar:?]
    at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.handleMessage(ProcessBufferProcessor.java:136) [graylog.jar:?]
    at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.dispatchMessage(ProcessBufferProcessor.java:121) [graylog.jar:?]
    at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.onEvent(ProcessBufferProcessor.java:93) [graylog.jar:?]
    at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.onEvent(ProcessBufferProcessor.java:47) [graylog.jar:?]
    at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:143) [graylog.jar:?]
    at com.codahale.metrics.InstrumentedThreadFactory$InstrumentedRunnable.run(InstrumentedThreadFactory.java:66) [graylog.jar:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]

The message is lost in processing after this.

Steps to Reproduce (for bugs)

  1. use the following rule in a pipeline
    rule "extract json from field message"
    when
    is_json(parse_json(to_string($message.message))) == true
    //starts_with(to_string($message.message), "{") && ends_with(to_string($message.message), "}")
    then
    set_field("original_message", to_string($message.message));
    let json = parse_json(to_string($message.message));
    let map = to_map(json);
    set_fields(map);
    set_field("DEBUG", true);
    end
  2. send the following message (in this case RAW Input)
    
    echo -n '{"type": "server", "timestamp": "2020-07-21T12:57:14,875+02:00", "level": "WARN", "component": "o.e.c.r.a.AllocationService", "cluster.name": "es4mong2-preprod", "node.name": "vox.ada.de", "message": "[check-result][2] marking unavailable shards as stale: [3SVpryx_TPqp_qZI3_wl6A]", "cluster.uuid": "g08V_RpuRO-URM1daup9aQ", "node.id": "RzJre1HLTH-xqCqgk3jClA" } ' | nc -w0 192.168.195.50 5555

## Context
The JSON parsing in Graylog is not stable and extractor and processing pipeline do work different. We should improve that at least in a way that no messages are lost. You can only make this work if the messages are known always by making use of a rule like the following, but that is not working if you do not know the messages you expect. 

rule "extractvia jsonpath - with stacktrace" when is_json(parse_json(to_string($message.message))) == true && contains(to_string($message.message), "stacktrace") then set_field("original_message", to_string($message.message)); let json = parse_json(to_string($message.message)); let new_fields = select_jsonpath(json, { time: "$.timestamp", type: "$.type", loglevel: "$.level", component: "$.component", message: "$.message", cluster_name: "$.['cluster.name']", node_name: "$.['node.name']", cluster_uuid: "$.['cluster.uuid']", node_id: "$.['node.id']", stacktrace: "$.stacktrace" } ); set_fields(new_fields); end



## Your Environment

* Graylog Version: 3.3

[Z#843703]
kroepke commented 4 years ago

I cannot reproduce this issue, neither in a 3.3.2 test instance in pipelines and netcat, nor in master or 3.3 unit tests.

xtruthx commented 4 years ago

I have a similiar Warning in my envrionment but my messages are not lost and are parsed by the select_jsonpath Function

I have the Following:

2020-07-30T14:37:08.920+02:00 WARN  [JsonParse] Unable to parse JSON
com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'Virtual': was expecting ('true', 'false' or 'null')
 at [Source: (String)"Virtual field 'Barva' not added, because a real field with this name exists already."; line: 1, column: 8]
        at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1804) ~[graylog.jar:?]
        at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:703) ~[graylog.jar:?]
        at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._reportInvalidToken(ReaderBasedJsonParser.java:2853) ~[graylog.jar:?]
        at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._handleOddValue(ReaderBasedJsonParser.java:1899) ~[graylog.jar:?]
        at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:757) ~[graylog.jar:?]
        at com.fasterxml.jackson.databind.ObjectMapper._readTreeAndClose(ObjectMapper.java:4043) ~[graylog.jar:?]
        at com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:2552) ~[graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.functions.json.JsonParse.evaluate(JsonParse.java:53) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.functions.json.JsonParse.evaluate(JsonParse.java:36) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.ast.expressions.FunctionExpression.evaluateUnsafe(FunctionExpression.java:63) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.ast.functions.ParameterDescriptor.required(ParameterDescriptor.java:116) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.functions.conversion.MapConversion.evaluate(MapConversion.java:48) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.functions.conversion.MapConversion.evaluate(MapConversion.java:33) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.ast.expressions.FunctionExpression.evaluateUnsafe(FunctionExpression.java:63) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.ast.functions.ParameterDescriptor.required(ParameterDescriptor.java:116) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.functions.messages.SetFields.evaluate(SetFields.java:52) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.functions.messages.SetFields.evaluate(SetFields.java:33) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.ast.expressions.FunctionExpression.evaluateUnsafe(FunctionExpression.java:63) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.ast.expressions.Expression.evaluate(Expression.java:41) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.ast.statements.FunctionStatement.evaluate(FunctionStatement.java:32) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.evaluateStatement(PipelineInterpreter.java:385) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.executeRuleActions(PipelineInterpreter.java:369) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.evaluateStage(PipelineInterpreter.java:309) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.processForResolvedPipelines(PipelineInterpreter.java:267) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.process(PipelineInterpreter.java:147) [graylog.jar:?]
        at org.graylog.plugins.pipelineprocessor.processors.PipelineInterpreter.process(PipelineInterpreter.java:103) [graylog.jar:?]
        at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.handleMessage(ProcessBufferProcessor.java:136) [graylog.jar:?]
        at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.dispatchMessage(ProcessBufferProcessor.java:121) [graylog.jar:?]
        at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.onEvent(ProcessBufferProcessor.java:93) [graylog.jar:?]
        at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.onEvent(ProcessBufferProcessor.java:47) [graylog.jar:?]
        at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:143) [graylog.jar:?]
        at com.codahale.metrics.InstrumentedThreadFactory$InstrumentedRunnable.run(InstrumentedThreadFactory.java:66) [graylog.jar:?]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_262]

In this case "Virtual field 'Barva' not added, because a real field with this name exists already." is the content of the message field or field value in the JSON object. It seems that the parser has trouble with the single quotes?

patrickmann commented 3 years ago

The sample from jalogisch has these issues:

Once I fixed / removed those fields, the message is parsed without issues. The rule is applied successfully and creates the expected fields and values.

I could not reproduce the failure reported by xtruthx.

I tested this in both version 3.3 and 4.2. Incidentally, issues like this are now much easier to debug using the Failures stream introduced in 4.2.