A Clojure abstraction for sending structured data to SIEMs
unseemly |ˌənˈsēmlē|, adj.

(of behavior or actions) not proper or appropriate: an unseemly squabble.

unsiemly |ˌənˈsēmlē|, noun

a library for sending structured data to SIEMs

This project attempts to give you a simple abstraction for sending structured data to a SIEM. It's intended for use by security operations teams, but since most modern SIEMs look an awful lot like stream processing tools, you can probably use it for a bunch of other stuff.

Currently supports ElasticSearch (with optional support for AWS' hosted ElasticSearch and its proprietary message signing), GCP's StackDriver, BigQuery and SNS. Because we use java.time/JSR310, this project requires JDK 8 or higher.


Start by adding unsiemly to your dependencies; the library is published on Clojars.

The primary abstraction is a manifold stream. This makes it easy to test your code by giving you a clean separation between your logic and the actual mechanics of getting data into your SIEM.

(require '[unsiemly.core :as u])

There are two entry points: u/->siem! and u/siem-sink!. If you already have a stream and you just want to point it at a SIEM, u/->siem! is what you want. u/siem-sink! builds a new stream for you that you can put stuff into. Both take an opts map.

Generic options

The following options exist regardless of your specific SIEM type:

Reporting to stdout

A simple builtin :stdout SIEM type exists that just prints each message. The following options exist (where stdout is an alias for the unsiemly.stdout namespace):

Reporting to ElasticSearch (including AWS hosted ElasticSearch)

For ElasticSearch, the ::u/siem-type value is :elasticsearch. The indices are automatically partitioned by day, formatted as $yourlogname-yyyy-MM-dd. The following options exist (where es is an alias for the unsiemly.elasticsearch namespace):
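
To make the daily partitioning concrete, here is a minimal sketch of how an index name in the $yourlogname-yyyy-MM-dd shape could be computed. The helper daily-index-name is hypothetical and not part of unsiemly; it takes the date explicitly because this README does not say which time zone the library uses to decide where one day ends.

```clojure
(import '(java.time LocalDate)
        '(java.time.format DateTimeFormatter))

;; Formatter for the yyyy-MM-dd suffix described above.
(def ^DateTimeFormatter day-format
  (DateTimeFormatter/ofPattern "yyyy-MM-dd"))

(defn daily-index-name
  "Hypothetical helper (not unsiemly's API): returns the index name that
  messages for log-name on the given LocalDate would land in."
  [log-name ^LocalDate date]
  (str log-name "-" (.format day-format date)))

(daily-index-name "auth-events" (LocalDate/of 2018 3 7))
;; => "auth-events-2018-03-07"
```

This is handy when you need to predict which index to query for a given day.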

Reporting to StackDriver

For GCP StackDriver, the ::u/siem-type value is :stackdriver and no extra options exist. Credentials are automatically taken from the environment as per the GCP SDK.

Reporting to BigQuery

For GCP BigQuery, the ::u/siem-type value is :bigquery and the following extra options exist (where ub is an alias for the unsiemly.bigquery namespace):

If the project id is unspecified, uses the default project. If the dataset id is unspecified, the ::u/log-name is used. If the table id is unspecified, unsiemly is used.
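
The fallback rules above can be sketched as a small pure function. This is an illustration only: the keys :project-id, :dataset-id, :table-id and :log-name are hypothetical stand-ins rather than unsiemly's real option names, and :default-project is just a placeholder for whatever the GCP environment resolves.

```clojure
(defn bigquery-target
  "Illustrative only: resolves the project, dataset and table under the
  fallback rules described above. All keys here are hypothetical."
  [{:keys [project-id dataset-id table-id log-name]}]
  {:project-id (or project-id :default-project) ; placeholder for the default project
   :dataset-id (or dataset-id log-name)         ; dataset falls back to the log name
   :table-id   (or table-id "unsiemly")})       ; table falls back to "unsiemly"

(bigquery-target {:log-name "auth-events"})
;; => {:project-id :default-project, :dataset-id "auth-events", :table-id "unsiemly"}
```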

Reporting to AWS SNS

For AWS SNS, the ::u/siem-type value is :sns and the following extra options exist (where us is an alias for the unsiemly.sns namespace):

The log name will be sent as the SNS message subject.

Manifold stream 101

To put stuff onto a stream:

(require '[manifold.stream :as ms])
(ms/put! siem {"hi" "from unsiemly"})

By default, streams won't keep your process running (most of the work is done in daemon threads), so if you have a short-lived process and you just want to put some stuff on the stream and then quit, there's a convenience API that returns a manifold deferred:

(u/process! opts msgs)

Reformatting values

Usually, the data you have won't be in a format that your SIEM can consume.

By default, common data types that can't be serialized as-is are already handled. For example, for SIEMs that consume JSON, keywords are transformed to strings, timestamps are converted to ISO 8601, et cetera. As a rule, you can just give unsiemly the data structure you already have and it will probably do something reasonable with it.
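
As a rough picture of what that kind of default conversion looks like, here is a minimal sketch for a JSON-consuming SIEM. It is not unsiemly's actual implementation: json-friendly is a hypothetical name, and it only covers keywords and java.time.Instant values.

```clojure
(require '[clojure.walk :as walk])
(import '(java.time Instant))

(defn json-friendly
  "Illustrative only: walks data and replaces values a JSON encoder
  would not handle with plain strings."
  [data]
  (walk/postwalk
   (fn [x]
     (cond
       (keyword? x)          (subs (str x) 1) ; :a/b -> "a/b", :c -> "c"
       (instance? Instant x) (str x)          ; Instant prints as ISO 8601
       :else                 x))
   data))

(json-friendly {:event :login
                :at    (Instant/parse "2018-03-07T12:00:00Z")
                :tags  [:vpn :mfa]})
;; => {"event" "login", "at" "2018-03-07T12:00:00Z", "tags" ["vpn" "mfa"]}
```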

If you have additional parsing needs, check out unsiemly.xforms, which has utilities for less obvious transforms. This can be useful if you need a very specific timestamp format, for example. Strings are never processed further, so converting a value to the string format you want will always work.
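
Because strings pass through untouched, one way to get a specific timestamp format is to format the value yourself before putting the message on the stream. The sketch below uses plain java.time rather than anything from unsiemly.xforms; the pattern and the helper name with-formatted-timestamp are assumptions chosen for illustration.

```clojure
(import '(java.time Instant ZoneOffset)
        '(java.time.format DateTimeFormatter))

;; Hypothetical target format; substitute whatever your SIEM expects.
(def ^DateTimeFormatter siem-ts-format
  (-> (DateTimeFormatter/ofPattern "yyyy/MM/dd HH:mm:ss")
      (.withZone ZoneOffset/UTC)))

(defn with-formatted-timestamp
  "Replaces the Instant stored under key k in msg with a preformatted string."
  [msg k]
  (update msg k (fn [^Instant t] (.format siem-ts-format t))))

(with-formatted-timestamp
  {:event "login" :at (Instant/parse "2018-03-07T12:34:56Z")}
  :at)
;; => {:event "login", :at "2018/03/07 12:34:56"}
```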

Configuration via the environment

(require '[unsiemly.env :refer [opts-from-env!]])
(def siem (u/siem-sink! (opts-from-env!)))

Environment variable names match the regular opt names, but in upper case and with underscores instead of dashes; for example, SIEM_TYPE turns into :unsiemly.core/siem-type. Opts specific to a SIEM type are prefixed with the name of that SIEM type; for example, ELASTICSEARCH_HOSTS turns into :unsiemly.elasticsearch/hosts. Lists (like ELASTICSEARCH_HOSTS) are comma-delimited. Booleans are just the strings true and false.
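
The naming convention can be sketched as a small mapping function. This is an illustration of the rule as described here, not the code in unsiemly.env: env-name->opt-key and parse-list are hypothetical names, and the set of type prefixes is simply the SIEM types mentioned in this README.

```clojure
(require '[clojure.string :as str])

;; SIEM types named in this README; used only to recognise type prefixes.
(def siem-types #{"elasticsearch" "stackdriver" "bigquery" "sns" "stdout"})

(defn env-name->opt-key
  "Illustrative only: maps an environment variable name to the namespaced
  opt keyword it corresponds to under the convention described above."
  [env-name]
  (let [[head & tail :as parts] (str/split (str/lower-case env-name) #"_")]
    (if (and (siem-types head) (seq tail))
      (keyword (str "unsiemly." head) (str/join "-" tail))
      (keyword "unsiemly.core" (str/join "-" parts)))))

(defn parse-list
  "Splits a comma-delimited value such as ELASTICSEARCH_HOSTS."
  [s]
  (str/split s #","))

(env-name->opt-key "SIEM_TYPE")           ;; => :unsiemly.core/siem-type
(env-name->opt-key "ELASTICSEARCH_HOSTS") ;; => :unsiemly.elasticsearch/hosts
(parse-list "es1:9200,es2:9200")          ;; => ["es1:9200" "es2:9200"]
```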

Relation to unclogged

This project shares a number of things with unclogged:

However, they are different projects with different goals. This project is an abstraction over SIEMs, consuming mostly-structured data and giving you tools for transforming it into a structure your SIEM can usefully consume. Meanwhile, unclogged only cares about syslog. Syslog strictly consumes strings, not structured messages. unclogged will let you cheat and send in a structured message, but internally it just converts the object to a string. That process can't be configured, and if you're unlucky you'll just see a clojure.lang.LazySeq@deadbeef. Otherwise, you'll get an EDN-ish data structure, which might be fine, but also might be totally different from what your SIEM expects for further processing, alerting, et cetera.

It would make sense for unsiemly to use unclogged to send information to a syslog-speaking SIEM (see issue #2). Neither project replaces the other: they're cousins operating at different abstraction layers.


Copyright © Latacora

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.