logkafka - Collect logs and send lines to Apache Kafka 0.8+

Introduction (中文文档 / Chinese documentation available)

logkafka sends log file contents to Kafka 0.8+ line by line, treating each line of a file as one Kafka message.
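
For example, assuming a collecting config already maps a hypothetical file /tmp/test.log to a hypothetical topic test_log (see Usage below), each appended line arrives as its own message, which you can check with the console consumer that ships with Kafka 0.8:

    # Append two lines to the watched file (path and topic are illustrative).
    echo "line one" >> /tmp/test.log
    echo "line two" >> /tmp/test.log

    # Each line above should show up as a separate message.
    # Adjust $KAFKA_HOME to your Kafka 0.8 installation.
    $KAFKA_HOME/bin/kafka-console-consumer.sh --zookeeper 127.0.0.1:2181 \
                                              --topic test_log --from-beginning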

See the FAQ if you want to deploy it in a production environment.


Features

Differences with other log aggregation and monitoring tools

The main differences from Flume, Fluentd, and Logstash are:

Users of logkafka


Supported operating systems

Requirements

Build

There are two ways to build; choose whichever suits your environment.

  1. Install librdkafka (> 0.8.6), libzookeeper_mt, libuv (> v1.6.0), and libpcre2 (> 10.20) manually (a package-manager sketch follows this list), then

    cmake -H. -B_build -DCMAKE_INSTALL_PREFIX=_install
    cd _build
    make -j4
    make install
  2. Just let cmake handle the dependencies (cmake version >= 3.0.2).

    cmake -H. -B_build -DCMAKE_INSTALL_PREFIX=_install \
                       -DINSTALL_LIBRDKAFKA=ON \
                       -DINSTALL_LIBZOOKEEPER_MT=ON \
                       -DINSTALL_LIBUV=ON \
                       -DINSTALL_LIBPCRE2=ON
    cd _build
    make -j4
    make install

    If installation of any of these libraries fails, install it manually and set the corresponding option to -DINSTALL_LIBXXX=OFF.
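
For reference, on Debian/Ubuntu most of the prerequisites for the first method are available as packages; the names below are the common ones, but the distro versions may be older than the minimums listed above, in which case build the affected library from source:

    # Typical Debian/Ubuntu package names (check versions against the minimums above).
    sudo apt-get install build-essential cmake \
                         librdkafka-dev libzookeeper-mt-dev \
                         libuv1-dev libpcre2-dev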

Usage

Note: If you already have Kafka and ZooKeeper installed, you can start from step 2; just replace the ZooKeeper connection string in the following steps with your own (the default is 127.0.0.1:2181).

  1. Deploy Kafka and ZooKeeper on the local host

    tools/grid bootstrap
  2. Start logkafka

    • local conf

    Customize _install/conf/logkafka.conf to your needs:

    zookeeper.connect = 127.0.0.1:2181
    pos.path       = ../data/pos.myClusterName
    line.max.bytes = 1048576
    ...
    • run

    Run in the foreground

    _install/bin/logkafka -f _install/conf/logkafka.conf -e _install/conf/easylogging.conf

    Or as a daemon

    _install/bin/logkafka --daemon -f _install/conf/logkafka.conf -e _install/conf/easylogging.conf
  3. Configuration Management

    Use the UI or the command line tools.

    3.1 UI (with kafka-manager)

    We provide logkafka as a kafka-manager extension. You need to install and start kafka-manager and add a cluster with logkafka enabled; then you can manage logkafka from the 'Logkafka' menu.

    • How to add cluster with logkafka enabled


    • How to create new config


    • How to delete configs


    • How to list configs and monitor sending progress


    3.2 Command line tools

    We use a PHP script (tools/log_config.php) to create, delete, and list collecting configurations stored in ZooKeeper nodes.

    If you do not know how to install the PHP ZooKeeper module, check this (a possible install sketch also appears at the end of this section).

    • How to create configs

      Example:

      Collect the Apache access log on host "test.qihoo.net" and send it to the Kafka brokers behind the ZooKeeper connection string "127.0.0.1:2181", using the topic "apache_access_log".

      php tools/log_config.php --create \
                              --zookeeper_connect=127.0.0.1:2181 \
                              --logkafka_id=test.qihoo.net \
                              --log_path=/usr/local/apache2/logs/access_log.%Y%m%d \
                              --topic=apache_access_log

      Note:

      • [logkafka_id, log_path] is the key of one config.
    • How to delete configs

      php tools/log_config.php --delete \
                              --zookeeper_connect=127.0.0.1:2181 \
                              --logkafka_id=test.qihoo.net \
                              --log_path=/usr/local/apache2/logs/access_log.%Y%m%d
    • How to list configs and monitor sending progress

      php tools/log_config.php --list --zookeeper_connect=127.0.0.1:2181

      The output looks like this:

      logkafka_id: test.qihoo.net
      log_path: /usr/local/apache2/logs/access_log.%Y%m%d
      Array
      (
          [conf] => Array
              (
                  [logkafka_id] => test.qihoo.net
                  [log_path] => /usr/local/apache2/logs/access_log.%Y%m%d
                  [topic] => apache_access_log
                  [partition] => -1
                  [key] =>
                  [required_acks] => 1
                  [compression_codec] => none
                  [batchsize] => 1000
                  [message_timeout_ms] => 0
                  [follow_last] => 1
                  [valid] => 1
              )
      
      )

    For more details about configuration management, see php tools/log_config.php --help.
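
Regarding the PHP ZooKeeper module needed by tools/log_config.php: one possible route is the PECL extension, sketched below on the assumption that the ZooKeeper C client library and the PHP development headers are already installed (package names and php.ini locations vary by system):

    # Build and install the PECL zookeeper extension.
    pecl install zookeeper
    # Enable it in php.ini (or a conf.d snippet), then verify it loads:
    #   extension=zookeeper.so
    php -m | grep zookeeper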

Benchmark

We tested with 2 brokers and 2 partitions:

Name                   Value
rtt min/avg/max/mdev   0.478/0.665/1.004/0.139 ms
message average size   1000 bytes
batchsize              1000
required_acks          1
compression_codec      none
message_timeout_ms     0
peak rate              20.5 Mb/s

Third Party

The most significant third-party packages are:

Thanks to the creators of these packages.

Developers

  1. Make sure you have lcov installed (check this).

Compile with unit tests enabled and the Debug build type:

cmake -H. -B_build -DCMAKE_INSTALL_PREFIX=_install \
                   -Dtest=ON \
                   -DCMAKE_BUILD_TYPE=Debug
cd _build
make
make logkafka_coverage  # run unit tests and collect coverage
  2. Google C++ Style Guide

Code that does not conform to this style should be fixed before committing; you can use cpplint to check the modified files.
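
For example, assuming cpplint is installed (e.g. via pip) and your work is branched off master, the modified C++ files can be checked like this (the ref and file globs are illustrative):

    # Lint only the C++ sources changed relative to master.
    cpplint $(git diff --name-only master -- '*.cc' '*.cpp' '*.h')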

TODO

  1. Multi-line mode