securityclippy / elasticintel

Serverless, low cost, threat intel aggregation for enterprise or personal use, backed by ElasticSearch.
GNU General Public License v3.0

ElasticIntel

About

An alternative to expensive threat intel aggregation platforms which ingest the same data feeds you could get for free.

ElasticIntel is designed to provide a central, scalable and easily queryable repository for threat intelligence of all types.

It uses AWS services (Lambda, Elasticsearch, S3, SNS) to keep support needs minimal while maintaining scalability, resilience, and performance.

Getting started

See the Getting started docs

Disclaimer.

Documentation for this project is currently lacking due to time constraints. This is actively being fixed and should be much more complete in a few days. Please check back soon if you're not ready to jump in blind :)

Features

Why ElasticIntel

ElasticIntel is the answer to a frustration which arose when evaluating various paid threat intel products and feeds. After reviewing the data from several of these services, I found that 90% of the data they were returning was data from publicly (and freely) available sources, simply aggregated into one place.

Even more frustrating was the fact that nearly all of them wanted to charge insane amounts for API access to this same data. That access was limited by volume, which made it nearly impossible to query the data in any significant quantity without paying even more.

Enrichment

Architecture

  1. Feed Scheduler Lambda - The feed scheduler Lambda runs once an hour, much like a cron job. It downloads the configurations for all feeds, checks their scheduled download times, and publishes a download job to an SNS topic whenever a feed needs to be downloaded.

  2. Ingest Feed Lambda - The ingest Lambda is triggered by messages arriving on an SNS topic. When a message arrives, the ingest Lambda reads the message, parses out the information about the intel feed, and downloads the feed itself. Once downloaded, the ingest Lambda stores a copy of the feed in S3 and then parses out the data in the feed. Once the data is parsed, the ingest Lambda puts the data into the intel index in Elasticsearch for easy querying.
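The scheduling step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `download_hour` field, the job layout, and the function names are all assumptions made for the example, and the SNS call is replaced by an injected `publish` callable so the logic can be run locally.

```python
import json
from datetime import datetime, timezone


def feeds_due(feed_configs, now):
    """Return the feeds whose (hypothetical) download_hour matches the current UTC hour."""
    return [feed for feed in feed_configs if feed["download_hour"] == now.hour]


def schedule_downloads(feed_configs, publish, now=None):
    """Publish one download job per due feed and return the jobs that were sent.

    `publish` is any callable that accepts a JSON string. Inside a Lambda it
    could wrap boto3's SNS client, e.g. sns.publish(TopicArn=..., Message=...).
    """
    now = now or datetime.now(timezone.utc)
    jobs = []
    for feed in feeds_due(feed_configs, now):
        # Assumed job layout: the feed dictionary plus the target ES index.
        job = {"feed": feed, "index": "intel"}
        publish(json.dumps(job))
        jobs.append(job)
    return jobs
```

Because the scheduler runs hourly, comparing against the current hour is enough in this sketch; a real scheduler might instead track the last successful download time per feed.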

Feed ingestion

Feeds are ingested through the ingestfeed Lambda function. This function is passed an event containing a feed dictionary, as well as the ES index where the indicators from the feed will be stored.

The function then reads the feed dictionary, downloads the appropriate data from the feed URL, saves that data to an S3 bucket as a timestamped file, parses that data into intel objects, and finally indexes the feed data in the specified ES index.
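As a rough sketch of those steps, the helpers below cover the parts that need no network access: unpacking the event, naming the timestamped S3 object, and turning a feed body into intel objects. The message layout (`feed` and `index` keys), the key format, the intel field names, and the one-indicator-per-line feed format are assumptions for illustration; only the `Records[0].Sns.Message` path reflects how SNS delivers messages to Lambda.

```python
import json
from datetime import datetime, timezone


def job_from_event(event):
    """Extract the feed dictionary and target index from an SNS-triggered event."""
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    return message["feed"], message["index"]  # assumed message keys


def s3_key_for(feed, when):
    """Build a timestamped object key (hypothetical naming scheme)."""
    return "feeds/{}/{}.txt".format(feed["name"], when.strftime("%Y%m%dT%H%M%SZ"))


def parse_intel(feed, raw_text, when):
    """Turn a one-indicator-per-line feed body into intel objects."""
    intel = []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comment lines
            continue
        intel.append({
            "indicator": line,
            "source": feed["name"],
            "timestamp": when.isoformat(),
        })
    return intel
```

In a full handler, the download, the S3 upload, and the Elasticsearch indexing would sit between these calls; they are omitted here because their details depend on the deployment.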

Elasticsearch

It is important to note that intel is not unique. Each feed is queried daily, and some intel may appear in a feed across multiple days. This is by design, to allow a historical view of indicators.

However, this may not be your default expected behavior when querying against the data, so it is important to realize that the number of times an indicator shows up may not be indicative of a high threat score.
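One way to work with this behavior is to collapse repeated documents on the client side before interpreting them. The sketch below assumes each document carries `indicator` and `timestamp` fields (hypothetical names, with timestamps as ISO-format strings so they sort correctly as text) and reduces a list of hits to one entry per indicator.

```python
def summarize_sightings(docs):
    """Collapse repeated intel documents into one entry per indicator."""
    summary = {}
    for doc in docs:
        entry = summary.setdefault(
            doc["indicator"],
            {"first_seen": doc["timestamp"], "last_seen": doc["timestamp"], "sightings": 0},
        )
        # ISO-format strings compare chronologically, so min/max work directly.
        entry["first_seen"] = min(entry["first_seen"], doc["timestamp"])
        entry["last_seen"] = max(entry["last_seen"], doc["timestamp"])
        entry["sightings"] += 1
    return summary
```

Here `sightings` counts how many times a feed listed the indicator, which is a measure of how long it has been listed rather than of how dangerous it is.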

Setup

Requirements note

If pip3 fails on the crypto install, make sure libssl-dev is installed (sudo apt install libssl-dev).

Issues

Recommended Reading:

AWS Elasticsearch Service: http://docs.aws.amazon.com/elasticsearch-service/latest

Understanding Elasticsearch upgrades

The AWS Elasticsearch Service takes much of the hassle out of running your own Elasticsearch cluster. However, it is important to note that, even with this abstraction, the settings that remain under the end user's control are still important decisions.