Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org

JSON Parsed and prefixed content fails to parse timestamp #5746

Closed f0o closed 5 years ago

f0o commented 5 years ago

We parse and prefix fields from a JSON GELF input (fluentd, Kubernetes + application). The field in question ends up being called app_context_timestamp, and its JSON counterpart in the original input is:

{
 ...,
 message: {
  ...,
  context: {
   ...,
   timestamp: "2019-03-05 11:58:36.229161",
  }
 }
}

Expected Behavior

Store the field verbatim as a string, since it is called app_context_timestamp and not timestamp.

Current Behavior

The message is discarded because of a parsing error, which is only logged as a warning.

Possible Solution

Treat it verbatim if you can't parse it. Strings are fine.

Context

Logs in question are from fluentd forwarding OpenStack logs as json.


jalogisch commented 5 years ago

Could you please give us a little more background? Graylog does not have a JSON GELF input.

What kind of input did you use, and how did you parse the message? Did you use extractors or processing pipelines? If the latter, what do they look like, and what rules did you have?

thank you

f0o commented 5 years ago

The application logs JSON to a file. fluentd picks it up and wraps it in the GELF format (which is just JSON again, but with the message carried as a string). Graylog then uses the JSON extractor to parse the message field of the GELF input and prefixes the extracted JSON structure with app_.

However, it still tries to interpret the timestamp in field app_context_timestamp for whatever reason.

The issue here is that we cannot alter the format of that field in the application (OpenStack Neutron here) and a lot of messages are being discarded just because it fails to parse that one date.

We're absolutely fine with seeing that date as a string because we do not care about that date at all.
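To illustrate the flattening step described above, here is a minimal Python sketch of how a nested JSON message could end up as a field named app_context_timestamp. It is not Graylog's extractor code; the function name flatten_with_prefix and the "_" key separator are assumptions made for illustration.

```python
def flatten_with_prefix(obj, prefix="app_", sep="_"):
    """Flatten a nested dict into one level, prefixing every key.

    Hypothetical helper: it only mimics the effect described in this
    thread (nested keys joined with "_" and prefixed with "app_").
    """
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse, using the current name plus separator as the new prefix.
            flat.update(flatten_with_prefix(value, prefix=f"{name}{sep}", sep=sep))
        else:
            flat[name] = value
    return flat


# The parsed content of the GELF message field, reduced to the one
# value quoted in the report.
message = {"context": {"timestamp": "2019-03-05 11:58:36.229161"}}
fields = flatten_with_prefix(message)
print(fields)
```

The value stays a plain string at this stage; nothing in the flattening itself turns it into a date.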

kroepke commented 5 years ago

@f0o I believe this is because of Elasticsearch's dynamic mapping date detection:

https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html#date-detection

If that's true, the fix would be to create a custom mapping template that turns it off for this field.
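As a rough sketch of what such a template could look like, the Python snippet below builds an index template body with date_detection set to false. Note that date_detection is a mapping-level setting, so it applies to all dynamically detected fields in the matched indices rather than to a single field. The index pattern "graylog_*", the mapping type "message", and the legacy "template" key are assumptions based on Elasticsearch versions of that era; newer Elasticsearch versions use index_patterns and have no mapping types.

```python
import json


def date_detection_off_template(index_pattern="graylog_*", mapping_type="message"):
    """Build a template body that disables dynamic date detection.

    The defaults are assumptions; adjust them to the index prefix and
    Elasticsearch version actually in use.
    """
    return {
        "template": index_pattern,
        "mappings": {
            mapping_type: {
                # Strings that look like dates are then mapped as strings.
                "date_detection": False,
            }
        },
    }


# Print the JSON body that would be sent to the Elasticsearch template API.
print(json.dumps(date_detection_off_template(), indent=2))
```

A template only affects indices created after it is installed, so existing indices keep their current mapping.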

f0o commented 5 years ago

@kroepke If I understand the document correctly, these mappings are per index. That would mean I need to write some sort of hook to make Graylog disable it on the new indices it creates, right?

//Edit: But seeing that those mappings must come from somewhere (I guess Graylog?), I'd appreciate it if I could just treat that field as text, or if index.mapping.ignore_malformed defaulted to true on all indices Graylog creates.

f0o commented 5 years ago

I followed http://docs.graylog.org/en/3.0/pages/configuration/elasticsearch.html#custom-index-mappings now to handle the field as keyword/string instead of date.

I tried adding the setting index.mapping.ignore_malformed to the template, but it's not shown in the merged templates at the deflector index, so I guess I will have to keep an eye on the logs and add more fields as they pop up.

I'll consider this issue """resolved""". A way to enforce ignore_malformed on a global scale would be ideal, though.
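For readers following the same custom index mapping approach, here is a minimal Python sketch that builds a template body mapping a list of fields as keyword. It is not the exact template used in this thread. The index pattern "graylog_*", the mapping type "message", and the helper name keyword_mapping_template are assumptions; the layout follows the legacy template format of Elasticsearch versions from that time.

```python
import json


def keyword_mapping_template(field_names, index_pattern="graylog_*", mapping_type="message"):
    """Build a template body that maps each given field as a keyword.

    Hypothetical helper: extend field_names as more offending fields
    show up in the logs.
    """
    properties = {name: {"type": "keyword"} for name in field_names}
    return {
        "template": index_pattern,
        "mappings": {mapping_type: {"properties": properties}},
    }


# Only the field named in this thread; others can be appended later.
body = keyword_mapping_template(["app_context_timestamp"])
print(json.dumps(body, indent=2))
```

The printed JSON would be installed through the Elasticsearch template API, and the new mapping takes effect for the next index Graylog creates, for example after an index rotation.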