GoogleCloudPlatform / fluent-plugin-detect-exceptions

A fluentd plugin that scans line-oriented log streams and combines exception stacks into a single log entry.
Apache License 2.0

django python traceback not picked up #59

Closed: h1771t closed this issue 2 years ago

h1771t commented 5 years ago

Hi,

I have Django running in a GKE pod, emitting exceptions that this plugin does not detect. The plugin is active on my cluster. The entire exception arrives in the log viewer as separate lines, and the Error Reporting view does not pick up the error, which leads me to think the exception is not being detected.

The first line of the log entry looks like this: textPayload: "[Wed Sep 11 10:46:24.519502 2019] [wsgi:error] [pid 8] [client 10.28.2.1:42946] Traceback (most recent call last):, referer: https://x.x.x.x/login?next=/"

Looking at the Python rules (I don't know Ruby), at first glance it looks like this rule uses the ^ and $ markers to match the start and end of the line: https://github.com/GoogleCloudPlatform/fluent-plugin-detect-exceptions/blob/master/lib/fluent/plugin/exception_detector.rb#L81

Since the sample log line has several fields before the word "Traceback", the regex does not match it. Django also puts the referer at the end of the line, so the $ in the regex prevents a match as well.
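To illustrate the mismatch described above, here is a minimal Python sketch. It assumes the plugin's Python start rule is anchored roughly like the `ANCHORED` pattern below; I have not copied the pattern from the plugin source, so treat it as an approximation. The `matches` helper and both pattern names are hypothetical.

```python
import re

# Assumption: approximates the plugin's anchored Python start rule.
ANCHORED = re.compile(r"^Traceback \(most recent call last\):$")
# The same text with the line start and end markers dropped.
UNANCHORED = re.compile(r"Traceback \(most recent call last\):")

# The sample line from the textPayload above.
apache_line = (
    "[Wed Sep 11 10:46:24.519502 2019] [wsgi:error] [pid 8] "
    "[client 10.28.2.1:42946] Traceback (most recent call last):"
    ", referer: https://x.x.x.x/login?next=/"
)
# What a bare Python process would print.
bare_line = "Traceback (most recent call last):"


def matches(pattern, line):
    """Return True if the pattern is found anywhere in the line."""
    return pattern.search(line) is not None


print(matches(ANCHORED, bare_line))      # anchored rule on a bare traceback line
print(matches(ANCHORED, apache_line))    # anchored rule on the wrapped line
print(matches(UNANCHORED, apache_line))  # unanchored rule on the wrapped line
```

The anchored pattern accepts only the bare line, while the unanchored one also finds the traceback header inside the wrapped line.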

I can provide any other info needed as well. Any assistance appreciated.

h1771t commented 4 years ago

It seems the log format comes from Apache, not Django. The setup is Django run via Apache and mod_wsgi in GKE pods, with Apache configured to send normal logs to stdout and error logs to stderr. I'm not sure what you can recommend here, given that you can't cover every log format out there. Would searching for the traces without the line start and line end markers be an option? I'm trying as far as possible to avoid rolling out a custom fluentd config for my GKE cluster. Any advice appreciated. Thanks.

h1771t commented 4 years ago

Anything? Or would I just have to roll out a custom fluentd as per https://cloud.google.com/solutions/customizing-stackdriver-logs-fluentd ? If so, I would have to roll out a custom DaemonSet. That would be locked to a specific version of the gcr.io/stackdriver-agents/stackdriver-logging-agent Docker image. How would I then track updates to the image released by GKE at gcr.io/ so that I can update my DaemonSet accordingly?

h1771t commented 4 years ago

Actually, I'd have to run a custom image, not GKE's: I'd have to patch the fluent-plugin-detect-exceptions plugin and roll out a custom image that includes the patch. Is there anything that can be done upstream to avoid me having to do this? Any advice appreciated.

h1771t commented 4 years ago

@igorpeshansky, any thoughts on dropping the line start and end markers in the matches? As noted, Apache, nginx, etc. prepend content to the lines that WSGI apps output, which causes the lines not to match. I imagine all web servers do this, and in different ways.
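For comparison, an alternative to dropping the markers is to strip the web server's wrapper before the anchored rule is applied. The Python sketch below assumes the Apache error-log layout seen in the sample line (leading bracketed fields, optional trailing `, referer: ...`). The `strip_apache_wrapper` helper and both wrapper patterns are hypothetical, not part of the plugin, and other servers would need their own patterns.

```python
import re

# Assumption: Apache error-log lines start with one or more "[...]" fields.
APACHE_PREFIX = re.compile(r"^(?:\[[^\]]*\]\s+)+")
# Assumption: Apache may append ", referer: <url>" to the end of the line.
REFERER_SUFFIX = re.compile(r", referer: \S+$")
# Assumption: approximates the plugin's anchored Python start rule.
ANCHORED = re.compile(r"^Traceback \(most recent call last\):$")


def strip_apache_wrapper(line):
    """Remove the bracketed prefix and referer suffix from a log line."""
    line = APACHE_PREFIX.sub("", line, count=1)
    return REFERER_SUFFIX.sub("", line, count=1)


apache_line = (
    "[Wed Sep 11 10:46:24.519502 2019] [wsgi:error] [pid 8] "
    "[client 10.28.2.1:42946] Traceback (most recent call last):"
    ", referer: https://x.x.x.x/login?next=/"
)

cleaned = strip_apache_wrapper(apache_line)
print(cleaned)
print(ANCHORED.search(cleaned) is not None)
```

This keeps the stricter anchored match, at the cost of one extra pattern per web server format. A line that is already bare passes through unchanged.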