flant / loghouse

Ready to use log management solution for Kubernetes storing data in ClickHouse and providing web UI.
Apache License 2.0
925 stars 76 forks source link

In fluentd Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: _SELINUX_CONTEXT #19

Open yurchenkosv opened 7 years ago

yurchenkosv commented 7 years ago

I have following in logs when deploy loghouse via helm:

Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: _SELINUX_CONTEXT
 2017-11-08 05:28:55 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-11-08 05:28:55 +0000 error_class="RuntimeError" error="command returns 29952: bash /usr/local/bin/insert_ch.sh /tmp/fluent-plugin-exec-20171108-7-kfyf2o" plugin_id="object:3fce70caaa48"
  2017-11-08 05:28:55 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.40/lib/fluent/plugin/out_exec.rb:104:in `write'
  2017-11-08 05:28:55 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.40/lib/fluent/buffer.rb:354:in `write_chunk'
  2017-11-08 05:28:55 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.40/lib/fluent/buffer.rb:333:in `pop'
  2017-11-08 05:28:55 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.40/lib/fluent/output.rb:342:in `try_flush'
  2017-11-08 05:28:55 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.40/lib/fluent/output.rb:149:in `run'
Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: _SELINUX_CONTEXT
 2017-11-08 05:28:55 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-11-08 05:28:57 +0000 error_class="RuntimeError" error="command returns 29952: bash /usr/local/bin/insert_ch.sh /tmp/fluent-plugin-exec-20171108-7-fm3rdj" plugin_id="object:3fce70caaa48"
  2017-11-08 05:28:55 +0000 [warn]: suppressed same stacktrace
Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: _SELINUX_CONTEXT
 2017-11-08 05:28:57 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-11-08 05:29:01 +0000 error_class="RuntimeError" error="command returns 29952: bash /usr/local/bin/insert_ch.sh /tmp/fluent-plugin-exec-20171108-7-1tfk6fi" plugin_id="object:3fce70caaa48"

What with it could be connected?

yurchenkosv commented 7 years ago

Сделали проверку, что это за ошибка, путем модификации скрипта insert_ch.sh Получили следующий вывод в консоль:

2017-11-08 07:12:42 +0000 [warn]: suppressed same stacktrace
Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: _SELINUX_CONTEXT
 2017-11-08 07:29:19 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-11-08 08:04:59 +0000 error_class="RuntimeError" error="command returns 29952: bash /usr/local/bin/insert_ch.sh /tmp/fluent-plugin-exec-20171108-7-11q1bjb" plugin_id="object:3f9d896aa8cc"
  2017-11-08 07:29:19 +0000 [warn]: suppressed same stacktrace

// Это получили после модификации скрипта
{"_SELINUX_CONTEXT":"system_u:system_r:kernel_t:s0","_SYSTEMD_INVOCATION_ID":"dabc3bc11ebc4f8db8fc4d98fb8b235e","timestamp":1503909212,"source":"docker","host":"kubernetes-worker1","stream":"stdout","string_fields.names":["log"],"string_fields.values":["time=\"2017-08-28T08:33:32.389934701Z\" level=info msg=\"API listen on /var/run/docker.sock\""]}

Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: _SELINUX_CONTEXT
 2017-11-08 08:04:59 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-11-08 09:06:11 +0000 error_class="RuntimeError" error="command returns 29952: bash /usr/local/bin/insert_ch.sh /tmp/fluent-plugin-exec-20171108-7-1kudxtx" plugin_id="object:3f9d896aa8cc"
  2017-11-08 08:04:59 +0000 [warn]: suppressed same stacktrace

Судя по всему, скрипт пытается запихнуть в БД, данные, которые не соответствуют структуре данных в таблице, поскольку содержат дополнительные ключи _SELINUX_CONTEXT, _SYSTEMD_INVOCATION_ID

Ожидаемая структура данных:

timestamp       DateTime
nsec    UInt32
source  String
namespace       String
host    String
pod_name        String
container_name  String
stream  String
labels.names    Array(String)
labels.values   Array(String)
string_fields.names     Array(String)
string_fields.values    Array(String)
number_fields.names     Array(String)
number_fields.values    Array(Float64)
boolean_fields.names    Array(String)
boolean_fields.values   Array(Float64)
null_fields.names       Array(String)

Я не специалист во fluentd, но возможно, поможет модификация конфига в части обработки лог сообщения, как сделано здесь:

<filter kubelet.**>
  @type record_modifier
  <record>
    timestamp              ${time}
    source                 "kubelet"
    host                   ${record["_HOSTNAME"]}
    stream                 ${record["_TRANSPORT"]}
    string_fields.names    ${["log"]}
    string_fields.values   ${[record["MESSAGE"]]}
  </record>
  remove_keys _TRANSPORT, PRIORITY, SYSLOG_FACILITY, _UID, _GID, _CAP_EFFECTIVE, _SYSTEMD_SLICE, _BOOT_ID, _MACHINE_ID, _HOSTNAME, SYSLOG_IDENTIFIER, _PID, _COMM, _EXE, _CMDLINE, _SYSTEMD_CGROUP, _SYSTEMD_UNIT, MESSAGE
</filter>

Как видно из конфига, запись формируется из конкретных полей лог сообщения, а невалидные ключи удаляются.

yurchenkosv commented 7 years ago

Решили проблему путем добавления фильтра в fluentd:

    <filter **>
      @type record_modifier
      whitelist_keys timestamp, nsec, source, namespace, host, pod_name, container_name, stream, labels.names, labels.values, string_fields.names, string_fields.values, number_fields.names, number_fields.values, boolean_fields.names, boolean_fields.values, null_fields.names 
    </filter>