logstash-plugins / logstash-patterns-core

Apache License 2.0

new Grok Pattern to match multiline strings, e.g. Stacktraces #314

Open stoerr opened 2 years ago

stoerr commented 2 years ago

I'd like to suggest introducing a pattern like ((?m).*) that matches a multiline string, such as what you get when you apply the multiline filter before the grok filter in Logstash to capture, for example, stacktraces in a Java logfile.

Background: When testing my update of GrokConstructor, I noticed that there is still no pattern in the Grok pattern library that matches a stacktrace, as in https://grokconstructor.appspot.com/do/construction?example=0 . As a result, the incremental construction cannot suggest anything sensible that would match the full message for input like this:

2013-02-28 09:57:56,668 ERROR SomeCallLogger - ESS10005 Cpc portalservices: Exception caught while writing log messege to MEA Call:  {}
java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist

    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:445)
    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)

Here you are pretty much stuck in the construction process after \A%{TIMESTAMP_ISO8601}%{SPACE}%{LOGLEVEL}%{SPACE}%{JAVACLASS}%{SPACE}%{JAVALOGMESSAGE} and have to use a handmade regular expression like ((?m).*) to match the stacktrace, which likely isn't something everybody can type in without some research.
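To illustrate the difference, here is a minimal Ruby sketch (grok patterns follow Ruby/Oniguruma regex semantics, where `(?m)` lets `.` match newlines). The `PREFIX` regex is only a simplified stand-in for the `%{TIMESTAMP_ISO8601}%{SPACE}%{LOGLEVEL}...` part, and the group names are made up for this example:

```ruby
# Sample event as it might look after the lines have been joined with "\n".
EVENT = [
  "2013-02-28 09:57:56,668 ERROR SomeCallLogger - ESS10005 Cpc portalservices: Exception caught",
  "java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not exist",
  "    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:445)"
].join("\n")

# Simplified stand-in for the timestamp / level / class prefix (assumption,
# not the real grok patterns).
PREFIX = '\A(?<timestamp>\S+ \S+)\s+(?<level>[A-Z]+)\s+(?<logger>\S+)\s+'

# Plain .* stops at the first newline.
SINGLE_LINE = Regexp.new(PREFIX + '(?<rest>.*)')
# With (?m), the dot also matches newlines, so the rest of the event is captured.
MULTI_LINE = Regexp.new(PREFIX + '(?<rest>(?m).*)')

SINGLE_REST = SINGLE_LINE.match(EVENT)[:rest]
MULTI_REST = MULTI_LINE.match(EVENT)[:rest]

puts SINGLE_REST.lines.count  # 1: only the remainder of the first line
puts MULTI_REST.lines.count   # 3: remainder of the first line plus the stacktrace
```

Because `(?m)` sits inside the capture group, it only affects that group and leaves the prefix matching unchanged.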

So my suggestion is to add such a pattern, ((?m).*), to the grok pattern library. I'm not sure what to name it: STACKTRACE is one option, but I've often seen multi-line log messages that do not contain a stacktrace, so something like MULTILINE_REST, GREEDY_MULTILINE, or MULTILINE_DATA might be more appropriate.
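As a rough sketch of how such a named pattern would behave, the following toy Ruby expander mimics grok's `%{NAME:field}` substitution. It is not the real grok implementation: `PATTERNS`, `expand`, and the `LEVEL` entry are hypothetical, and `GREEDY_MULTILINE` is just one of the candidate names from above:

```ruby
# Toy pattern table. GREEDYDATA mirrors the existing library pattern (.*);
# GREEDY_MULTILINE is the proposed addition, not an existing pattern.
PATTERNS = {
  "LEVEL"            => '[A-Z]+',
  "GREEDYDATA"       => '.*',
  "GREEDY_MULTILINE" => '(?m).*'
}

# Expand each %{NAME:field} into a named capture group, roughly as grok does.
def expand(expression)
  expression.gsub(/%\{(\w+):(\w+)\}/) do
    "(?<#{$2}>#{PATTERNS.fetch($1)})"
  end
end

TEXT = "ERROR first line\nsecond line\n    at third line"

WITH_GREEDYDATA = Regexp.new(expand('\A%{LEVEL:level} %{GREEDYDATA:rest}')).match(TEXT)
WITH_MULTILINE  = Regexp.new(expand('\A%{LEVEL:level} %{GREEDY_MULTILINE:rest}')).match(TEXT)

puts WITH_GREEDYDATA[:rest].inspect  # "first line"
puts WITH_MULTILINE[:rest].inspect   # "first line\nsecond line\n    at third line"
```

Since the expansion wraps each pattern in its own group, the `(?m)` flag stays local to the multiline pattern and does not change how other patterns in the same expression match.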

Thank you!