cloudviz / agentless-system-crawler

A tool to crawl systems like crawlers for the web
Apache License 2.0

Make it easier to visualize collected data #66

Closed ricarkol closed 7 years ago

ricarkol commented 8 years ago

It would be awesome to have something like this:

make start-crawler-and-elk # or ./crawl.sh all or ./launch.sh all

Or something similar that starts a crawler for containers (in a container), and an ELK stack in a container (using https://github.com/cloudviz/crawler-elk-stack) that receives data from that crawler. The output of that make start-crawler-and-elk should be a URL pointing at the pretty data (some Kibana dashboard).

nadgowdas commented 8 years ago

Currently, if we use the 'url' emitter with the 'http' target of logstash, the whole crawl frame is collected by logstash, dumped to ES, and can be viewed in Kibana (on a standard ELK stack). But the whole frame is a single string, so it's hard to parse.

Proposal: I am going to add a new emitter that outputs every JSON (every feature) from the frame as a separate entity to logstash. And I will add a namespace to every JSON output so that they can be correlated and queried/viewed together in Kibana. So should we make the ELK deployment the default output format?

@ricarkol In the 'make' approach you mentioned, does the user specify input arguments (e.g. features) as input parameters to make?
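The per-feature split proposed above could look something like this sketch in Python. The function name, the frame layout (one JSON document per line), and the `namespace` field are illustrative assumptions, not the crawler's actual API:

```python
import json

def split_frame(frame_text, namespace):
    """Split a raw crawl frame (assumed here to be one JSON feature
    per line) into separate documents, each tagged with the crawl
    namespace so they can be correlated and queried together in
    Kibana. Illustrative sketch only, not the crawler's real code."""
    docs = []
    for line in frame_text.splitlines():
        line = line.strip()
        if not line:
            continue
        feature = json.loads(line)
        # Tag every feature with the namespace it came from.
        feature["namespace"] = namespace
        docs.append(feature)
    return docs

frame = ('{"feature_type": "process", "pname": "sshd"}\n'
         '{"feature_type": "package", "pkgname": "openssl"}')
for doc in split_frame(frame, "host1/container1"):
    print(json.dumps(doc))
```

Each document would then be shipped to logstash individually instead of as one opaque string.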

ricarkol commented 8 years ago

@nadgowdas that's a great idea. Yes, it will make things easier to query for.

So should we make the ELK deployment the default output format? --> the default format is whatever looks best on stdout (we want that to be the easiest example)

In the 'make' approach you mentioned, does the user specify input arguments (e.g. features) as input parameters to make? --> Yes, and also default to container crawling mode. The most important output of that make is having the two containers running in the background (1. the crawler container and 2. the ELK container), and printing a URL to see the data in Kibana (running in the ELK container). Something like this:

$ make launch-all
Starting crawler container.... Done
Starting elk container.... Done

You can see the collected data at http://1.2.3.4/awesome-kibana-dashboard
$

So, now when you start a container, you automagically get the data in that kibana dashboard at http://1.2.3.4/awesome-kibana-dashboard.
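A minimal sketch of what such a launch-all target might look like. The image names, ports, and crawler flags below are assumptions for illustration, not the project's actual values:

```make
# Hypothetical Makefile target (image names, ports, and flags are
# assumptions, not the repo's real configuration).
launch-all:
	@echo -n "Starting elk container...."
	docker run -d --name elk -p 5601:5601 -p 8080:8080 \
	    cloudviz/crawler-elk-stack
	@echo " Done"
	@echo -n "Starting crawler container...."
	docker run -d --name crawler --privileged --pid=host \
	    --link elk:elk cloudviz/agentless-system-crawler \
	    --crawlmode OUTCONTAINER --url http://elk:8080
	@echo " Done"
	@echo "You can see the collected data at http://localhost:5601"
```

The key design point is that make only orchestrates two docker run commands and prints the Kibana URL; all crawler configuration stays in the crawler's own arguments.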

nadgowdas commented 8 years ago

Currently make doesn't start a container? It only runs agentless-system-crawler-test?

ricarkol commented 8 years ago

Currently the all target starts tests. But the all target can be whatever you want really.


ricarkol commented 8 years ago

If you want to make the all target the one that starts the crawler and the elk container, go for it. I think it makes sense to have a separate test target.


ricarkol commented 8 years ago

The only thing missing now is some example dashboards. Maybe: one for listing processes, and one for listing packages?

canturkisci commented 7 years ago

@nadgowdas do you think this is still a valid issue for ASC?

nadgowdas commented 7 years ago

We have an automated setup for ELK to receive the crawler output.

When you do make in the crawler, it sets up an ELK container configured with logstash listening on an HTTP port and the crawler sending its output to it.
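The logstash side of that setup might look roughly like this pipeline fragment. The port, hosts, and index name are assumptions for illustration, not the repo's actual configuration:

```
# Hypothetical logstash pipeline: accept crawler output over HTTP
# and index each received document into Elasticsearch, where Kibana
# can query it. Port and index name are assumptions.
input {
  http {
    port => 8080
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "crawler-%{+YYYY.MM.dd}"
  }
}
```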

We should close this one.

canturkisci commented 7 years ago

Thanks Shripad.