Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org

Dots in field names are replaced silently #13043

Open mpfz0r opened 2 years ago

mpfz0r commented 2 years ago

The fact that we don't support dots in field names is nothing new. This code has been there for ages, silently replacing dots with underscores when a Message gets written to ES/OS:

https://github.com/Graylog2/graylog2-server/blob/master/graylog2-server/src/main/java/org/graylog2/plugin/Message.java#L404-L406

The problem is that we neither document this behavior nor log a warning when we replace dots.

Because the field renaming happens very late in processing, it can cause confusion when writing pipeline rules or extractors: the ingested Message shows up with underscore field names in the search, but within processing you need to work with the dotted names.
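
To make the mismatch concrete, here is a minimal Python sketch. It is not the actual Graylog code (that lives in Message.java and is written in Java); the helper name `to_index_fields` and the field names are made up for illustration, and the warning is the behavior this issue asks for, not something the server does today.

```python
import logging

logger = logging.getLogger("field_names")


def to_index_fields(fields):
    """Return a copy of `fields` with dots in keys replaced by underscores.

    Hypothetical helper: mimics the late, index-time renaming described
    above and adds the warning that is currently missing.
    """
    indexed = {}
    for name, value in fields.items():
        safe_name = name.replace(".", "_")
        if safe_name != name:
            logger.warning("Renaming field %r to %r before indexing", name, safe_name)
        indexed[safe_name] = value
    return indexed


# During processing the field is still addressed by its dotted name ...
message = {"source.ip": "10.0.0.1", "level": 3}
assert "source.ip" in message

# ... but the stored document only contains the underscore variant.
stored = to_index_fields(message)
print(stored)  # {'source_ip': '10.0.0.1', 'level': 3}
```

A rule that checks for `source.ip` during processing and a search that looks for `source_ip` afterwards are therefore both correct, which is exactly what makes the behavior confusing without documentation or a log line.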

Possible solutions

Related problems

Slash characters in field names are a similar case. They are not allowed in ES/OS. Here we behave differently: we ignore the field containing a slash in its name while ingesting the rest of the message. The problem was mentioned in #12990, where we introduced rate-limited logging at INFO level to report dropped fields. It seems we need a single issue to cover all problems related to special characters in field names, and this issue has been chosen for that.
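
The slash handling described above can be sketched in the same way. Again, this is an illustrative Python sketch under assumptions, not the implementation from #12990: the class `RateLimitedLog`, the function `drop_slash_fields`, and the interval value are all hypothetical.

```python
import logging
import time

logger = logging.getLogger("field_names")


class RateLimitedLog:
    """Emit at most one INFO line per `min_interval_seconds` (hypothetical helper)."""

    def __init__(self, min_interval_seconds, clock=time.monotonic):
        self._min_interval = min_interval_seconds
        self._clock = clock
        self._last = None  # time of the last emitted line, None if nothing logged yet

    def info(self, msg, *args):
        """Log `msg` if the interval has elapsed; return True if it was emitted."""
        now = self._clock()
        if self._last is None or now - self._last >= self._min_interval:
            self._last = now
            logger.info(msg, *args)
            return True
        return False


def drop_slash_fields(fields, log):
    """Keep every field except those with '/' in the name; report drops via `log`."""
    kept = {}
    for name, value in fields.items():
        if "/" in name:
            log.info("Dropping field %r: '/' is not allowed in field names", name)
            continue
        kept[name] = value
    return kept


log = RateLimitedLog(min_interval_seconds=5)
print(drop_slash_fields({"path/to": "x", "level": 3}, log))  # {'level': 3}
```

Note the asymmetry with dots: a dotted field survives under a different name, while a field with a slash disappears from the stored message entirely.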

Refs:
https://github.com/Graylog2/graylog2-server/issues/12990
https://github.com/Graylog2/graylog2-server/issues/6588
https://github.com/Graylog2/graylog2-server/pull/5983
https://github.com/Graylog2/graylog2-server/issues/4583
https://github.com/elastic/elasticsearch/issues/15951

mpfz0r commented 2 years ago

[ HS #972457853 ]

coffee-squirrel commented 2 years ago

Mentioned on the case, but being able to debug($message) (or equivalent; giving a point-in-time view of the Message) would've helped greatly in figuring out what was going on.

mpfz0r commented 2 years ago

@coffee-squirrel Like this? https://github.com/Graylog2/graylog2-server/pull/13178

coffee-squirrel commented 2 years ago

@mpfz0r Nice; that'll be useful 👍 Not a big deal from our perspective, but some might want to pass debug() a message parameter, as certain other functions (e.g. has_field()) allow.

mpfz0r commented 2 years ago

@coffee-squirrel Yeah, makes sense. I didn't do that in the beginning, because it required some changes to the rule parser. But it should work now.

OzzyKampha commented 7 months ago

Can this get fixed? Newer versions of Elastic use . in field names as standard now.

pandel commented 3 months ago

Will this ever be solved? The issue was created 5+ years ago and it seems the whole thing hasn't been addressed yet, even though many people seem to have problems with this "dots aren't allowed" thing (think of all the Wazuh users who want to replace filebeat with Graylog!)

FlatCodeIq commented 1 month ago

Hello,

We need this case to be addressed ASAP. Wazuh Dashboard is unable to read the logs because the separator is "_" instead of ".".

janheise commented 1 month ago

@pandel @FlatCodeIq What exactly is the problem with Wazuh Dashboard? I'm not familiar with that product. The . will be addressed at some point, but I cannot give an estimate, as it's a major change (e.g. a . in a key creates a nested object) and we have to keep backwards compatibility in mind. But we're discussing it.
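
To illustrate the "a . in a key creates a nested object" point: the following Python sketch expands dotted keys into nested objects and shows the kind of conflict that makes this a non-trivial change. The function `expand_dotted` and the field names are hypothetical, and the sketch assumes flat messages whose values are scalars.

```python
def expand_dotted(fields):
    """Expand keys like 'a.b.c' into nested dicts (hypothetical helper).

    Assumes the values in `fields` are scalars. Raises ValueError when the
    same name would have to be both a plain value and an object.
    """
    nested = {}
    for name, value in fields.items():
        parts = name.split(".")
        node = nested
        # Walk or create the intermediate objects for every part but the last.
        for depth, part in enumerate(parts[:-1], start=1):
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                prefix = ".".join(parts[:depth])
                raise ValueError(
                    f"{name!r} needs {prefix!r} to be an object, "
                    "but it already holds a value"
                )
            node = child
        leaf = parts[-1]
        if isinstance(node.get(leaf), dict):
            raise ValueError(f"{name!r} holds a value, but it is already used as an object")
        node[leaf] = value
    return nested


print(expand_dotted({"data.srcip": "10.0.0.1", "data.win.id": 4624, "level": 3}))
# {'data': {'srcip': '10.0.0.1', 'win': {'id': 4624}}, 'level': 3}

try:
    expand_dotted({"data": "raw text", "data.srcip": "10.0.0.1"})
except ValueError as err:
    print("conflict:", err)
```

With underscores, `data` and `data_srcip` can coexist as two flat fields; with dots, `data` would have to be a value and an object at the same time, so existing messages and mappings that rely on the flat form could start to conflict.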

pandel commented 1 month ago

@janheise Wazuh is a very decent and popular open source SIEM solution. Every data element is organized via .-structured field names. The main dashboards, which are provided by the developers, cannot find the data fields when every . is replaced with _; it's that simple.

Wazuh itself normally uses filebeat, but if you replace filebeat with Graylog (i.e. to use CoPilot with all of its additional features), you lose capabilities which the Wazuh dashboards normally provide...