riemann / riemann

A network event stream processing system, in Clojure.
https://riemann.io
Eclipse Public License 1.0

Store the events for riemann in an external database #1003

Open aratik711 opened 2 years ago

aratik711 commented 2 years ago

Is your feature request related to a problem? Please describe.
Currently there is no way to scale Riemann horizontally. I wanted to know whether events could be stored in a separate DB/cache shared by multiple Riemann instances, which would make Riemann scalable.

Describe the solution you'd like
Use a separate DB to store events, TTLs, etc. before they are sent to their destination (e.g. InfluxDB).

Describe alternatives you've considered
I haven't figured out any alternatives yet. Suggestions are welcome.

jarpy commented 2 years ago

You might want to try the Riemann Users mailing list to have a conversation about architectural patterns that could help you achieve your goals.

My team, for example, uses Logstash as a routing and queuing layer in front of Riemann. It is configured to send most events to Elasticsearch for storage, and also sends some of them to Riemann. If we needed to, we could use the routing layer to route subsets of events to multiple Riemann instances. Riemann itself is inherently not a distributed application, doing everything in memory. That makes it really fast, but leaves distributed architecture decisions in the hands of the operator.

sanel commented 2 years ago

AFAIK you can't scale Riemann this way, because there are two things to store:

1) The index database, which you might or might not use. This is just a hashmap of internal metrics, kept until they expire. It could be moved to external storage, such as Redis, relatively easily.

2) Core state held through function calls. I don't think this can easily be put anywhere else.

I think the only "proper" way of scaling Riemann is to use federation, as Prometheus does [1] and as @jarpy mentioned: have one Riemann that accepts all metrics and passes them down to other Riemann instances, which do the specific logic, calculations, or storing things in a database. Diagram:

                                +--------> riemann #2
            +------------+      |
  metric -> | riemann #1 | -----+
            +------------+      |
                                +--------> riemann #3

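Code:

(streams
  ; riemann-2 and riemann-3 are assumed to be Riemann clients defined elsewhere in the config.
  ; Match on :service, since :metric is numeric and would not match a regex.
  (where (service #"^cpu")
    (forward riemann-2))

  (where (service #"^disk")
    (forward riemann-3)))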
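
The snippet above assumes that riemann-2 and riemann-3 are already bound to Riemann clients. A minimal sketch of how they might be defined in the config of riemann #1, placed before the stream definitions; the hostnames and port here are placeholders, not anything taken from this thread:

```clojure
; Hypothetical downstream hosts; replace with the addresses of your own instances.
; riemann.client/tcp-client opens a TCP connection to another Riemann server.
(def riemann-2 (riemann.client/tcp-client :host "riemann-2.example.com" :port 5555))
(def riemann-3 (riemann.client/tcp-client :host "riemann-3.example.com" :port 5555))
```

Each downstream instance then runs its own ordinary config (index, calculations, database writes) and only ever sees the subset of events routed to it.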

Now, you could scale riemann #1 by adding multiple nodes behind e.g. HAProxy, as long as riemann #1 only forwards events and keeps no state of its own. Also, if you happen to lose riemann #2, you won't get "cpu" events, but you'll still get "disk" events. Not ideal, but better than a single instance.

[1] https://prometheus.io/docs/prometheus/latest/federation/

vipinvkmenon commented 2 years ago

Currently, we use the approach mentioned by @sanel as a pseudo-multi-AZ setup: we have multiple instances of riemann #1 behind a load balancer.