fluent-plugins-nursery / fluent-plugin-systemd

This is a fluentd input plugin. It reads logs from the systemd journal.
Apache License 2.0

Exception emitting record: "\xC2" from ASCII-8BIT to UTF-8 #26

Closed dannyk81 closed 7 years ago

dannyk81 commented 7 years ago

I'm using fluent-plugin-systemd version 0.0.5 on fluentd-0.12.31 to read journald logs from kube-apiserver, kubelet, kube-proxy, etc., and I'm getting the following errors from time to time:

Jan 19 07:58:49 ldpr-tga-kub01 docker[18358]: 2017-01-19 07:58:49 +0000 [error]: Exception emitting record: "\xC2" from ASCII-8BIT to UTF-8
Jan 19 07:58:49 ldpr-tga-kub01 docker[18358]: 2017-01-19 07:58:49 +0000 [warn]: suppressed same stacktrace
Jan 19 07:58:49 ldpr-tga-kub01 docker[18358]: 2017-01-19 07:58:49 +0000 [warn]: emit transaction failed: error_class=Encoding::UndefinedConversionError error="\"\\xC2\" from ASCII-8BIT to UTF-8" tag="system.kube-apiserver"

Any ideas why?

I stumbled upon this while trying to figure out why I stop receiving events (I use splunkapi output plugin) exactly when it turns midnight :confused:

I have other tail sources in the same setup, and they work fine with the same output. I'm not sure the above is related, but maybe?

errm commented 7 years ago

potentially related to #27

dannyk81 commented 7 years ago

Indeed, it seems related! Any leads on what's going on?

dannyk81 commented 7 years ago

@errm

Gentle ping :)

errm commented 7 years ago

Hi @dannyk81, so there seem to be two things going on here: when the journal file is rotated, logs stop being ingested, and there is the encoding issue in your logs.

I think these two things are unrelated, but the journal file rotation issue is the root cause of your symptoms (not receiving events); the encoding issue should only cause you to lose that one event.

I haven't had the bandwidth to look at either issue yet, but may be able to over the next week at some point...

dannyk81 commented 7 years ago

Thanks for the feedback :+1:

Indeed the file rotation issue is the one that troubles me at this point.

errm commented 7 years ago

Re: the encoding issue, it sounds like LANG is not set up correctly.

You probably want something like this in your Dockerfile:

RUN locale-gen en_US.UTF-8
ENV LANG en_US.UTF-8

dannyk81 commented 7 years ago

Interesting, will give it a try!

errm commented 7 years ago

Humm ignore that, it seems like fluentd is specifically overriding the system encoding...

see https://github.com/fluent/fluentd/issues/803 and https://github.com/fluent/fluentd/commit/32790dc908f9364ac3ac99eb0d2e514a8f05b909

errm commented 7 years ago

I've had a quick look at this. The systemd plugin is correctly (as per the fluentd config) emitting the data in the ASCII-8BIT encoding. It looks like an output plugin you are using is trying to convert it to UTF-8. As I understand fluentd, that is a bug in that plugin, since fluentd expects everything to be ASCII-8BIT internally.
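
For anyone who wants to reproduce the error outside fluentd, here is a minimal plain-Ruby sketch. It is not taken from fluentd, fluent-plugin-systemd, or the splunkapi plugin; the helper names `transcode_error` and `relabel_as_utf8` are hypothetical, and the sample record is made up. It assumes the bytes coming from the journal are really UTF-8 text that is merely tagged as ASCII-8BIT, which may not hold for every journal entry.

```ruby
# Hypothetical helper: try to transcode a binary-tagged string to UTF-8
# and return the exception instead of raising it (nil if it succeeds).
def transcode_error(bytes)
  bytes.encode(Encoding::UTF_8)
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

# Hypothetical helper: relabel the same bytes as UTF-8 without converting
# them, then replace any byte sequences that are not valid UTF-8 with "?".
# Assumption: the bytes were UTF-8 text to begin with.
def relabel_as_utf8(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8).scrub("?")
end

# Made-up record containing a two-byte UTF-8 character (0xC2 0xB5),
# tagged as ASCII-8BIT the way fluentd handles strings internally.
raw = "latency 20\xC2\xB5s".b

err = transcode_error(raw)
puts "encode failed: #{err.class}" if err

fixed = relabel_as_utf8(raw)
puts "relabelled as #{fixed.encoding}, valid: #{fixed.valid_encoding?}"
```

The point of the sketch is that `encode` tries to convert each byte and has no mapping for bytes above 0x7F in ASCII-8BIT, while `force_encoding` only changes the label on the string. An output plugin that needs UTF-8 would probably want something along the lines of the second helper rather than a plain `encode`.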

dannyk81 commented 7 years ago

Understood, I'll have to look into that plugin internals (splunkapi), though we will probably switch to Elasticsearch soon.