cloudviz / agentless-system-crawler

A tool to crawl systems like crawlers for the web
Apache License 2.0
116 stars, 44 forks

Added ElasticSearch Emitter to index Frames into an ElasticSearch Index #394

Closed by maheshbabugorantla 4 years ago

maheshbabugorantla commented 4 years ago

ElasticSearch Emitter

Resolves Issue #392

Summary of Changes

  1. Added an Elasticsearch emitter
  2. Added a unit test that checks the formatting of crawler frames into the Elasticsearch document format
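For readers unfamiliar with the idea, here is a minimal sketch of how a crawler frame could be turned into Elasticsearch bulk-API lines. It is not the code from this PR: the function name `frame_to_bulk_lines`, the `(feature_type, feature_key, attributes)` tuple layout, the flat document shape, and the index name `crawler-frames` are all assumptions made for illustration.

```python
import json


def frame_to_bulk_lines(features, extra_metadata, index_name="crawler-frames"):
    """Build an Elasticsearch bulk (NDJSON) payload from crawled features.

    Hypothetical helper: `features` is assumed to be an iterable of
    (feature_type, feature_key, attributes) tuples, and `extra_metadata`
    the dict passed via --extraMetadata.
    """
    lines = []
    for feature_type, feature_key, attributes in features:
        # One flat document per feature; metadata fields are copied into
        # every document so they can be filtered on later.
        doc = dict(attributes)
        doc.update(extra_metadata)
        doc["feature_type"] = feature_type
        doc["feature_key"] = feature_key
        # The bulk API expects an action line followed by the source line.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [("os", "linux", {"distro": "ubuntu"})]
    meta = {"iteration_number": 1, "hostname": "my_ubuntu_1804"}
    print(frame_to_bulk_lines(sample, meta), end="")
```

A unit test for the formatting step can then parse each line back with `json.loads` and compare it against the expected document, without needing a running Elasticsearch instance.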

Results

Environment Setup

$ docker ps
CONTAINER ID        IMAGE                 COMMAND                  CREATED             STATUS              PORTS                                            NAMES
e96b77434155        kibana:7.4.2          "/usr/local/bin/dumb…"   21 hours ago        Up 21 hours         0.0.0.0:5601->5601/tcp                           tender_dijkstra
a658b6357af5        elasticsearch:7.4.2   "/usr/local/bin/dock…"   6 months ago        Up 22 hours         0.0.0.0:9200->9200/tcp, 0.0.0.0:9300->9300/tcp   es_emitter_test

Crawling and Indexing frames in INVM Crawl Mode

$ sudo venv/bin/python crawler/crawler.py --url elastic://localhost:9200 --features os,disk,process,package --extraMetadata '{"iteration_number": 1, "hostname": "my_ubuntu_1804"}' --format json

$ sudo venv/bin/python crawler/crawler.py --url elastic://localhost:9200 --features os,disk,process,package --extraMetadata '{"iteration_number": 2, "hostname": "my_ubuntu_1804"}' --format json

Kibana Query (Filtering by extraMetadata fields)

hostname : "my_ubuntu_1804" and iteration_number : "1"
[Screenshot: elastic_emitter_iteration_1]

hostname : "my_ubuntu_1804" and iteration_number : "2"
[Screenshot: elastic_emitter_iteration_2]
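The same filter can be expressed outside Kibana as an Elasticsearch query body. The sketch below is only an approximation of what the Kibana queries above do: it assumes the extraMetadata fields are indexed as top-level document fields, and the helper name `metadata_filter_query` is hypothetical. It only builds the request body and does not contact a cluster.

```python
import json


def metadata_filter_query(**fields):
    """Build a search body that requires every given field to match.

    Hypothetical helper: each keyword argument becomes one match_phrase
    clause, and the clauses are combined with a bool filter (logical AND).
    """
    clauses = [{"match_phrase": {name: value}} for name, value in fields.items()]
    return {"query": {"bool": {"filter": clauses}}}


if __name__ == "__main__":
    body = metadata_filter_query(hostname="my_ubuntu_1804", iteration_number="1")
    print(json.dumps(body, indent=2))
```

The resulting body could be sent to the `_search` endpoint of the index the frames were written to, for example on the `localhost:9200` instance from the environment setup above.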

maheshbabugorantla commented 4 years ago

@sahilsuneja1, the problem seems to persist. Let me squash all the commits, sign the resulting commit, and open a new PR.