allinurl / goaccess

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.
https://goaccess.io
MIT License

Using GoAccess to monitor everything, not just web logs! #845

Open josephbleroy opened 7 years ago

josephbleroy commented 7 years ago

I came across GoAccess a few months back and have been using it alongside netdata to monitor my cloud systems.

I really like the simplicity and dashboard style of GoAccess, as well as the language it was written in and dependency requirements.

I know that GoAccess was developed to monitor web related requests, such as errors, traffic, bandwidth, referrers, geo location, etc. However, I would like to modify the code to serve a different purpose: network and security related events.

Here's a little information on my project:

Here's a sample of one of the logs being monitored to detect malicious activity:

#set_separator  ,
#empty_field    (empty)
#unset_field    -
#path   intel
#open   2017-01-23-13-01-54
#fields ts      uid     id.orig_h       id.orig_p       id.resp_h       id.resp_p       seen.indicator  seen.indicator_type     seen.where      seen.node       matched sources fuid    file_mime_type  file_desc
#types  time    string  addr    port    addr    port    string  enum    enum    string  set[enum]       set[string]     string  string  string
1485194513.356126       CVmspB2e68PB5ZiXU5      192.168.1.3     47712   XXX.XXX.XXX.XXX  80      XXX.XXX.XXX.XXX  Intel::ADDR     Conn::IN_RESP   bro     Intel::ADDR     Bad Reputation Domain   -       -       -

There's a bunch of additional logs I'd like to analyze and report on, but I should be able to replicate the steps for additional logs once I figure out how to analyze and visualize the one above.
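As a starting point, here is a minimal Python sketch of reading a log like the one above into dictionaries keyed by the `#fields` header. It assumes the columns are tab-separated, which is the default for Bro's ASCII writer; the function name `read_bro_log` and the shortened sample are illustrative only and are not part of GoAccess or Bro.

```python
import io

def read_bro_log(lines):
    """Yield one dict per record, keyed by the names in the #fields header."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Header lines look like "#fields<TAB>ts<TAB>uid<TAB>..."
            key, _, rest = line.partition("\t")
            if key == "#fields":
                fields = rest.split("\t")
            continue
        # Assumption: data columns are tab-separated, so values may contain spaces.
        yield dict(zip(fields, line.split("\t")))

# Shortened, hypothetical sample with only four of the columns shown above.
sample = io.StringIO(
    "#path\tintel\n"
    "#fields\tts\tid.orig_h\tseen.indicator_type\tsources\n"
    "1485194513.356126\t192.168.1.3\tIntel::ADDR\tBad Reputation Domain\n"
)
records = list(read_bro_log(sample))
print(records[0]["sources"])
```

Splitting on tabs rather than on arbitrary whitespace matters here because values such as `Bad Reputation Domain` contain spaces.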

It'd be really easy to simply analyze the log directory for all files matching *.log and load them into their respective panel on GoAccess. I know it's not that simple, but perhaps it is, hence the reason I bring this issue up.

The panel below shows HTTP Status Codes. It operates almost the same way that I classify security threats on my network. For example, a 4xx status code on GoAccess's default panel would be Malware Hits on my mockup (below). At first, malware is generically classified but once you click to expand it shows the different types of malware classification (class a, class b, class c, etc).
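The parent/child breakdown described here (a generic category that expands into specific classes, much like 4xx expanding into 404, 403, and so on) can be sketched in Python as a two-level count. The category and subtype names below are hypothetical placeholders, and `nested_counts` is an illustrative helper, not a GoAccess function.

```python
from collections import Counter, defaultdict

def nested_counts(records, parent_key, child_key):
    """Count records per parent category, with a per-child breakdown."""
    tree = defaultdict(Counter)
    for rec in records:
        tree[rec[parent_key]][rec[child_key]] += 1
    return tree

# Hypothetical events; the names are made up for illustration.
events = [
    {"category": "Malware Hits", "subtype": "class a"},
    {"category": "Malware Hits", "subtype": "class b"},
    {"category": "Malware Hits", "subtype": "class a"},
    {"category": "Bad Reputation Domain", "subtype": "Intel::ADDR"},
]
tree = nested_counts(events, "category", "subtype")
for parent, children in tree.items():
    # Parent row shows the total; child rows show the expanded breakdown.
    print(parent, sum(children.values()))
    for child, hits in children.most_common():
        print("   ", child, hits)
```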

[Image: goaccess-threat-indicators]

I hope my explanations and intentions are clear and approachable. I'd love to build this into a product for network admins and security professionals to use with their existing IDS and IPS applications.

Thanks for reading!

allinurl commented 7 years ago

Thanks for sending this in, and for the detailed explanation. As you know, GoAccess was developed to monitor web-related data, and although there have been some requests for the ability to add custom panels (#515, #190), none of them have asked for support for different types of logs. However, I do like the idea of being able to simply parse a log and display the relevant data in a panel while using GoAccess' current plumbing.

This comes with a few challenges. The biggest is that the code is currently structured around web requests. For instance, right now it attempts to extract a hard-coded set of metrics (hits, visitors, bw, etc.) for every request. In the screenshot you posted above, the metrics shown are still the same as in the current version, but I'm assuming not all logs will need these metrics, so we will need to change this and allow the user to define which metrics or log fields (columns) should show up.

On the other hand, I think the parser is in good shape to extract pretty much any token/field from a log line. However, the user will need to supply a rule for how the panel data should be processed, that is, how values should be counted (e.g., sum, avg, perc) and whether they are children of an existing row (as in the case of HTTP status codes, browsers, OS, etc.).
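To make the idea of a user-supplied counting rule concrete, here is a rough Python sketch of a panel aggregator driven by a rule name. GoAccess has no such option today; the rule names simply mirror the sum/avg/perc examples above, and the function, field names, and sample rows are all hypothetical.

```python
from collections import defaultdict

def aggregate(records, group_by, field, rule):
    """Group records by `group_by` and reduce `field` using the named rule."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_by]].append(float(rec[field]))
    if rule == "sum":
        return {k: sum(v) for k, v in groups.items()}
    if rule == "avg":
        return {k: sum(v) / len(v) for k, v in groups.items()}
    if rule == "perc":
        # Share of the grand total, in percent (assumes a non-zero total).
        total = sum(sum(v) for v in groups.values())
        return {k: 100.0 * sum(v) / total for k, v in groups.items()}
    raise ValueError(f"unknown rule: {rule}")

# Hypothetical rows standing in for parsed log fields.
rows = [
    {"host": "192.168.1.3", "bytes": "100"},
    {"host": "192.168.1.3", "bytes": "300"},
    {"host": "192.168.1.4", "bytes": "600"},
]
totals = aggregate(rows, "host", "bytes", "sum")
shares = aggregate(rows, "host", "bytes", "perc")
print(totals)
print(shares)
```

Keeping the rule as data rather than code is one way a panel definition could live in a config file, though that design is only a guess at what such a feature might look like.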

Lastly, a few minor changes may be required to the JSON/HTML output in order to output different metrics. I believe the terminal output should cope well with any data.

Though, in terms of log parsing flexibility, I'm still not certain how far it should go considering the amount of changes involved and that there are a variety of tools out there that process multiple logs.

josephbleroy commented 7 years ago

@allinurl Thank you for your thorough response. As the lead developer, it's good to have your input.

The only thing that concerns me, as of right now at least, is processing multiple log sources.

The network security monitor has a directory structure as follows for its log files:

/opt/nsm/bro/logs/
├── current
│   ├── files.log
│   ├── conn.log
│   ├── notice.log
│   └── syslog.log
└── 2017-07-19
    ├── files.13:00:00-14:00:00.log.gz
    ├── conn.13:00:00-14:00:00.log.gz
    ├── notice.13:00:00-14:00:00.log.gz
    └── syslog.13:00:00-14:00:00.log.gz

So, with that example, /opt/nsm/bro/logs/current/files.log captures one hour's worth of traffic, either on a specified interface or by reading in a packet capture directly, and generates metadata related to files (hash, size, hosts, etc.). Once that hour has expired, the log is archived in /opt/nsm/bro/logs/2017-07-19 and the process repeats for 24 hours.

Since GoAccess was only configured to read one log at a time, such as with Nginx:

goaccess -f /var/log/nginx/mysite.com.log --log-format=COMBINED -o /var/www/mysite.com/html/stats/report.html

We're only analyzing /var/log/nginx/mysite.com.log, whereas what I'm trying to do is analyze multiple files inside /opt/nsm/bro/logs/current/ and eventually process historical data from the archived logs in /opt/nsm/bro/logs/2017-07-19.

However, after taking a glance at your man page, I saw the following info:

MULTIPLE LOG FILES

There are several ways to parse multiple logs with GoAccess. The simplest is to pass multiple log files to the command line:

# goaccess access.log access.log.1

This would lead me to assume that I could process multiple log files using the following command:

goaccess -f /opt/nsm/bro/logs/current/files.log /opt/nsm/bro/logs/current/conn.log /opt/nsm/bro/logs/current/notice.log /opt/nsm/bro/logs/current/syslog.log --log-format=COMBINED -o /var/www/mysite.com/html/stats/report.html

I assume I could also use zcat to analyze the archived logs, such as /opt/nsm/bro/logs/2017-07-19/files.13:00:00-14:00:00.log.gz.
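For experimenting outside GoAccess, the same idea (reading several current logs plus gzip-compressed archives in one pass) can be sketched in Python with the standard `gzip` module. This only illustrates the approach and does not describe how GoAccess reads files; `open_log` and `read_lines` are hypothetical helpers, and the demo uses throwaway files rather than real Bro logs.

```python
import gzip
import os
import tempfile

def open_log(path):
    """Open a log as text, decompressing on the fly if it ends in .gz."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")

def read_lines(paths):
    """Yield (path, line) pairs from several logs, skipping '#' header lines."""
    for path in paths:
        with open_log(path) as handle:
            for line in handle:
                if not line.startswith("#"):
                    yield path, line.rstrip("\n")

# Demo with temporary files standing in for a current log and an archived one.
with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "conn.log")
    packed = os.path.join(tmp, "conn.archived.log.gz")
    with open(plain, "w", encoding="utf-8") as f:
        f.write("#fields\tts\n1\n")
    with gzip.open(packed, "wt", encoding="utf-8") as f:
        f.write("#fields\tts\n2\n")
    lines = [line for _, line in read_lines([plain, packed])]
print(lines)
```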

Anyways, I'll go through and look at the code and see if I can get something working. I'd love to release a basic dashboard for monitoring security events on servers.

Thanks for your help thus far!