Closed MGD1981 closed 6 years ago
Hi, I could imagine that writing the log was very slow. Is this a one-off case, or can you reproduce the issue with a log file?
Could you please set the environment variable LOGAGENT_MULTILINE_TIMEOUT_MS=350
(the default value is 250 ms)? LOGAGENT_MULTILINE_TIMEOUT_MS
controls how long, in milliseconds, the multiline parser waits for new input.
Thanks for the suggestion -- unfortunately we have not yet been able to replicate the issue consistently, so we have not been able to test. But I do have a question about the default value for LOGAGENT_MULTILINE_TIMEOUT_MS: what are the negative consequences of setting this value too high? What might happen if, for example, I set it to a full second?
The agent might consume a bit more memory because more messages are buffered for a longer time. As long as you don't have a very high log volume, it should not be a problem. You might notice a delay of one second before a message is processed. If you don't parse the timestamp from the log, you might see a difference in the timestamp (but I'm not 100% sure about that, as we have several places dealing with timestamps).
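The trade-off described above can be sketched with a toy timeout-based multiline buffer. This is purely an illustration of the mechanism (the class and method names here are invented for the example, not Logagent's actual internals):

```javascript
// Toy multiline buffer: lines are appended to the current block, and the
// block is emitted once no new line arrives within `timeoutMs`.
// A higher timeout means more lines are held in memory and messages are
// emitted later -- the memory/latency trade-off discussed above.
class MultilineBuffer {
  constructor(timeoutMs, emit) {
    this.timeoutMs = timeoutMs;
    this.emit = emit;
    this.lines = [];
    this.timer = null;
  }
  add(line) {
    this.lines.push(line); // buffered until the timeout fires
    clearTimeout(this.timer);
    this.timer = setTimeout(() => this.flush(), this.timeoutMs);
  }
  flush() {
    if (this.lines.length > 0) this.emit(this.lines.join('\n'));
    this.lines = [];
  }
}
```

With a 1-second timeout, a stack trace whose lines arrive within one second of each other is emitted as a single message, but only one second after the last line arrives.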
Could you share a full stack trace? As far as I can see, the patterns for "backend" have no blockStart
definitions for the multiline parser. Therefore the default multiline separator is used, which covers only indented log messages.
multiline:
defaultSeparator: ^\S{2,}
If any line within the stack trace has no tab or space among its first two characters, the trace might be broken into two messages.
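To illustrate, the default separator treats any line beginning with two or more non-whitespace characters as the start of a new message, so indented stack-trace lines stay attached to the previous one (the sample lines below are hypothetical):

```javascript
// Default multiline separator: a line starting with 2+ non-whitespace
// characters begins a new log message; indented lines continue the block.
const defaultSeparator = /^\S{2,}/;

const lines = [
  'java.lang.NullPointerException: null',      // starts a new block
  '    at com.example.Foo.bar(Foo.java:42)',   // space-indented -> same block
  '\tat com.example.Baz.qux(Baz.java:7)',      // tab-indented -> same block
  'WARN something else happened'               // starts a new block
];

const startsNewBlock = lines.map((l) => defaultSeparator.test(l));
// startsNewBlock -> [true, false, false, true]
```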
For example, you could add the blockStart
definition in line 13 to split log messages by the "date" of each log line:
# split messages in blocks starting with a date like "18/07/18 20:57:49"
blockStart: !!js/regexp \d{2}/\d{2}/\d{2}\s\d{2}:\d{2}:\d{2}
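A quick way to sanity-check such a blockStart pattern is to test it against sample lines in Node.js (the `ERROR Executor` line below is a made-up example; note that in a JavaScript literal, and depending on your YAML setup, the slashes need escaping, which also comes up later in this thread):

```javascript
// blockStart pattern: a new message begins at a date like "18/07/18 20:57:49".
const blockStart = /\d{2}\/\d{2}\/\d{2}\s\d{2}:\d{2}:\d{2}/;

console.log(blockStart.test('18/07/18 20:57:49 ERROR Executor: task failed'));
// -> true (this line starts a new message)
console.log(blockStart.test('java.lang.NullPointerException: null'));
// -> false (continues the current message)
console.log(blockStart.test('    at com.example.Foo.bar(Foo.java:42)'));
// -> false (continues the current message)
```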
The example in the first message was the full stack trace, including the multiple spaces preceding each line after the first. The default separator should pick that up, yes? Your sharing the blockStart definition may be very valuable, though. Once we are able to replicate consistently, we will test it!
I am still not sure what happened in my original case. I have tried to fake replication by creating a container that prints out the contents of the original stack trace every 30 seconds, and setting LOGAGENT_MULTILINE_TIMEOUT_MS
to 1 (can it go lower?), but it is still not cutting off the multiline messages like we saw in our production instance. In the meantime I suppose we should close the ticket, and I'll let you know if I am ever able to replicate. Thanks for your help!
Alright, let us know then :)
Just FYI, the blockStart config you gave me has definitely fixed the issue, and we are now getting full stack traces all in the same log line -- I just had to tweak it a little (adding escape characters for slashes and colons). Thanks again!
Awesome, thanks for following up!
The following log had only its first two lines (up to and including "java.lang.NullPointerException: null", but nothing after that) ingested into the Sematext Docker Agent, without the rest of the stack trace (and these were ingested as two separate log entries):
I have attached my patterns.yml (compressed into a zip for GitHub), which contains a custom pattern for our backend service. It was derived for us by Radu Gheorghe a few months ago to allow that service's log severity to be parsed (lines 12-20 in the file).
patterns.zip