Reading notes and experiment log - Learning the ELK Stack #11


I recently spent a day working through the ELK book (notes below) and then set everything up on an AWS machine to experiment (the machine and some fixture data are still there); anyone interested can log in and continue experimenting.

The machine (a ~/.ssh/config entry):

```
Host devaws.gf
  Hostname 54.222.242.232
  User ubuntu
```

The directory layout:

```
ubuntu@ip-10-71-137-164:~/projects/learning-elk-stack-book$ tree .
.
├── docker-compose.yml
├── GOOG.csv
├── logstash.conf
└── raw.csv

0 directories, 4 files
ubuntu@ip-10-71-137-164:~/projects/learning-elk-stack-book$ cat *
version: '2'
services:
  elasticsearch:
    image: elasticsearch:1.7.3

  kibana:
    image: kibana:4.1.1
    environment:
      ELASTICSEARCH_URL: http://elasticsearch:9200
    links:
      - elasticsearch
    ports:
      - "3001:5601"
```

1: INTRODUCTION TO ELK STACK

2: BUILDING YOUR FIRST DATA PIPELINE WITH ELK

"使用 yahoo finance 中 GOOG 股票数据来 logstash 来收集解析和传入 ES,然后使用可视化【x, y轴。row/column】 chart,最后dashboard"

3: COLLECT, PARSE AND TRANSFORM DATA WITH LOGSTASH

4: CREATING CUSTOM LOGSTASH PLUGINS

5: WHY DO WE NEED ELASTICSEARCH IN ELK?

6: FINDING INSIGHTS WITH KIBANA

7: KIBANA – VISUALIZATION AND DASHBOARD

8: PUTTING IT ALL TOGETHER

"使用 Apache log 进行一次 ELK 集成和分析"

9: ELK STACK IN PRODUCTION

10: EXPANDING HORIZONS WITH ELK

Hands-on notes

logstash:

kopf:

Learning ELK stack

1: Intro to ELK Stack: the need for log analysis, and the challenges in log analysis

Running Logstash with an inline config:

```
bin/logstash -e 'input { stdin { } } output { stdout {} }'
```

With the rubydebug codec, output { stdout { codec => rubydebug } }, each event is printed as a hash:

```
{
       "message" => " Hello PacktPub",
    "@timestamp" => "2015-05-20T23:48:05.335Z",
      "@version" => "1",
          "host" => "packtpub"
}
```

Reading a file and shipping it to Elasticsearch:

```
input {
  file {
    type => "apache"
    path => "/user/packtpub/intro-to-elk/elk.log"
  }
}

output {
  elasticsearch { host => "localhost" }
}
```

Indexes in Elasticsearch can be inspected through /_search. A Logstash config can have a separate section for each type of plugin; if you specify multiple filters, they are applied in the order of their appearance in the configuration file.
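A tiny illustration of that ordering rule (a sketch, not from the book): here grok runs before mutate, so mutate can already see the field grok extracted.

```
filter {
  # runs first: extracts clientip, response, bytes, ... from message
  grok { match => { "message" => "%{COMMONAPACHELOG}" } }
  # runs second: can convert the response field grok just created
  mutate { convert => ["response", "integer"] }
}
```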

https://hub.docker.com/_/logstash/

docker-compose up

docker run -it --rm logstash -e 'input { stdin { } } output { stdout { } }'

docker-compose stop

docker run -d --name=elasticsearch -e ES_JAVA_OPTS="-Xms4000m -Xmx4000m" elasticsearch:1.7.3

docker run -d -p 3001:5601 --link elasticsearch:elasticsearch --name kibana -e "ELASTICSEARCH_URL=http://elasticsearch:9200" kibana:4.1.1

Validate the config format with logstash -f /config-dir/logstash.conf -t. It reported "Unknown setting 'host' for elasticsearch"; the fix is to change the setting to hosts => ["elasticsearch"].

Run it and verify, then inspect the results visually in Kibana. New rows can be appended on the fly, e.g.:

echo "2017-01-09,806.400024,809.966003,802.830017,806.650024,1272400,806.650024" >> GOOG.csv

docker run -it --rm --link elasticsearch:elasticsearch -v "$PWD":/config-dir logstash -f /config-dir/logstash.conf
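A quick way to confirm the rows actually landed in ES (this assumes port 9200 is reachable from the host, as the DELETE command further down suggests it was; publish it with -p 9200:9200 on the elasticsearch container if not):

```
# list indices and their doc counts
curl 'http://localhost:9200/_cat/indices?v'
# peek at one parsed event
curl 'http://localhost:9200/logstash-*/_search?q=*:*&size=1&pretty'
```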

Using the kopf plugin: https://hub.docker.com/_/elasticsearch/ https://github.com/lmenezes/elasticsearch-kopf/tree/master/docker

```
docker run -d -p 3000:80 -e KOPF_SERVER_NAME=grafana.dev \
  -e KOPF_ES_SERVERS=es.dev:9200 --name kopf lmenezes/elasticsearch-kopf
```

```
docker run -d -p 3000:80 -e KOPF_ES_SERVERS=elasticsearch:9200 -e KOPF_SERVER_NAME=grafana.dev \
  --link elasticsearch:elasticsearch --name es-kopf lmenezes/elasticsearch-kopf
```

Note: KOPF_ES_SERVERS must include the port number!

Started playing with the Kibana charts (visuals and dashboards, following along with the book).

Downloaded fresh GOOG.csv data and wiped the old dirty data. In kopf the built-in REST client can call ES with DELETE /_all, or from the shell:

curl -XDELETE 'http://localhost:9200/_all'

Importing a large amount of data in one burst caused: 1/21/2017 4:53:30 PM java.lang.OutOfMemoryError: Java heap space

Elasticsearch uses a hybrid mmapfs / niofs directory by default to store its indices. The default operating system limit on mmap counts is likely to be too low:

sysctl -w vm.max_map_count=262144
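To make that limit survive reboots (the sysctl -w setting above is transient), something like:

```
# persist the mmap limit across reboots
echo 'vm.max_map_count=262144' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
```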

ES_JAVA_OPTS="-Xms4000m -Xmx4000m"

Rancher can show the Docker containers on the host machine (even ones not running under a Rancher stack); drilling into a container shows its logs, load, and so on. Very useful!

Built some visualizations.

Only a little over 2k documents, yet consuming them was surprisingly slow... (tune this later!!)

// TODO: write a script that inserts a few rows into the CSV file every n seconds, so they flow on into ES; a sketch follows.
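A minimal sketch of such a script (plain bash; the price values are fake placeholders, not real quotes, and the CSV path assumes the layout above):

```
#!/usr/bin/env bash
# Append one synthetic GOOG row to the CSV every INTERVAL seconds so the
# Logstash file input keeps shipping fresh events into Elasticsearch.
CSV="${1:-GOOG.csv}"
INTERVAL="${2:-10}"

while true; do
  day=$(date +%Y-%m-%d)
  # fake price around 800; NOT real market data
  price=$(awk -v r="$RANDOM" 'BEGIN { printf "%.6f", 800 + r % 20 }')
  echo "${day},${price},${price},${price},${price},1272400,${price}" >> "$CSV"
  sleep "$INTERVAL"
done
```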

Putting it all together

The commonly used grok patterns are already included with the Logstash installation.

https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns

For example:

```
COMMONAPACHELOG %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-)
```
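To make that concrete, here is a made-up access-log line and the fields COMMONAPACHELOG pulls out of it:

```
10.0.2.2 - - [21/Jan/2017:16:53:30 +0000] "GET /index.html HTTP/1.1" 200 3248

clientip    => 10.0.2.2
ident       => -
auth        => -
timestamp   => 21/Jan/2017:16:53:30 +0000
verb        => GET
request     => /index.html
httpversion => 1.1
response    => 200
bytes       => 3248
```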

```
input {
  file {
    path => "/var/lib/tomcat7/logs/localhost_access_log.txt"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => { "message" => "%{COMMONAPACHELOG}" }
  }

  date {
    match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
  }

  mutate {
    convert => ["response", "integer"]
    convert => ["bytes", "integer"]
  }
}

output {
  elasticsearch {
    hosts => ["localhost"]
  }
}
```
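Before pointing this at the real Tomcat log, the filter chain can be smoke-tested from stdin with rubydebug output (a sketch: paste an access-log line like the made-up one above and inspect the parsed fields):

```
input { stdin { } }

filter {
  grok { match => { "message" => "%{COMMONAPACHELOG}" } }
  date { match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"] }
  mutate {
    convert => ["response", "integer"]
    convert => ["bytes", "integer"]
  }
}

output { stdout { codec => rubydebug } }
```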

Queries in the Kibana Discover view are built from fields and logical combinations, e.g. clientip:10.0.2.2 AND verb:GET
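A few more Discover queries in the same Lucene query-string syntax (the values are made up):

```
response:404
response:[400 TO 499] AND verb:POST
verb:GET AND NOT clientip:10.0.2.2
```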

Build a vertical bar chart showing the number of requests split across multiple clients: use a sub-aggregation via the Split Bars feature, splitting on the clientip term.

Use Markdown to build part of the dashboard: add one Markdown widget that gives an explanation of the dashboard.

PS: To drop an index you have to send a DELETE request to http://[your_host]:9200/[your_index_name_here]. You can also delete a single document: http://[your_host]:9200/[your_index_name_here]/[your_type_here]/[your_doc_id]
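The curl equivalents (the index, type, and id below are placeholders for the bracketed names above):

```
# delete one index
curl -XDELETE 'http://localhost:9200/logstash-2017.01.21'
# delete a single document
curl -XDELETE 'http://localhost:9200/logstash-2017.01.21/logs/AVnF7xGg'
```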

Screenshots from the hands-on session