prometheus / alertmanager

Prometheus Alertmanager
https://prometheus.io
Apache License 2.0
6.67k stars 2.16k forks source link

Securing the gossip protocol? #1322

Closed wrouesnel closed 3 years ago

wrouesnel commented 6 years ago

Is there anyway at the moment to add TLS authentication to the gossip protocol (or is it secure as-is?)

I need to comply with a policy that all traffic is encrypted by default within our infrastructure, so ideally there'd be support for specifying TLS server and client certificates on the command line to secure connections. Is this in the roadmap somewhere?

brian-brazil commented 6 years ago

It might happen sometime in the future, but I wouldn't hold your breath.

hoffie commented 6 years ago

I am also interested in this. Would it have to be TLS? It seems like the newly introduced memberlist implementation also supports AES-GCM encryption using the SecretKey Config option. This could be added somewhere in the memberlist initialization code in cluster.go.

Would this be an improvement which would be accepted by the project? If so, I might try to integrate and test this.

brian-brazil commented 6 years ago

I'd prefer not to get into ad-hoc security mechanisms that some of our dependencies happen to support, that could bring with it the same maintenance challenges we're currently trying to avoid with auth elsewhere.

wrouesnel commented 6 years ago

The trouble is that dependency does make securing it non-trivial since it likes to mix up UDP and TCP, so most of the usual tricks like putting stunnel in front of it don't work. It's kind of glaring because Prometheus everywhere else can be secured with TLS, but this can't.

Would you be open to a patch which (at least currently) traded some performance of the gossip protocol for allowing a set of regular TLS parameters? - i.e. a flag which enables encryption and limits the gossip library to using TCP only so regular TLS can be used? I wouldn't think that would constrain the future too much, since pretty much everything uses TLS in some form, and if you stick with gossip then nothing stops expanding the choice later to use whatever mechanism that allows.

brian-brazil commented 6 years ago

I think options for our standard tls_config client options would be okay. That'd have to go in the config file somewhere.

mxinden commented 5 years ago

@wrouesnel https://github.com/prometheus/alertmanager/pull/1763/ adds a design document as well as a prove of concept implementation to secure the gossip traffic. Please have a look. I would be interested in your thoughts.

lilic commented 4 years ago

@simonpasquier @brian-brazil @mxinden I would be interested in tackling this, just curious if the linked proposal doc is up to date? https://github.com/prometheus/alertmanager/blob/master/doc/design/secure-cluster-traffic.md

roidelapluie commented 4 years ago

Hello, AFAIK, we would like to reuse the same TLS code that the node_exporter is using.

mxinden commented 4 years ago

I would be interested in tackling this

Very cool!

just curious if the linked proposal doc is up to date?

It should be. But in the meantime multiple things happened:

  1. I created a proof-of-concept in https://github.com/prometheus/alertmanager/pull/1819.

  2. TLS support landed in node-exporter https://github.com/prometheus/node_exporter/pull/1277. The TLS configuration struct should be reused to configure Alertmanager gossip via TLS.

  3. A new proposal based on the design doc and my proof-of-concept has been created by @hooten in https://github.com/prometheus/alertmanager/pull/2237.

@lilic let me know if this is of some help.

hooten commented 4 years ago

Cool! @sharadgaur and I were waiting for a couple things to happen before continuing #2237:

  1. security audit of node_exporter's tls_config.go code
  2. the code to move to prometheus/common.

Looks like the latter hasn't happened yet. Has the former?

brian-brazil commented 4 years ago

The former happened.

iplay88keys commented 3 years ago

@lilic Hi! I just wanted to see where you are at with this. My team and I are also looking to have all traffic secure by default and are interested in the state of this work.

lilic commented 3 years ago

@iplay88keys I was under the impression @hooten was continuing the work first? IF this is not the case @iplay88keys feel free to take over this issue!

hooten commented 3 years ago

Yes. @sharadgaur and I would like to continue the work we’ve done. Thanks!

iplay88keys commented 3 years ago

@lilic @hooten Thanks for the information! We'll keep an eye on the progress, then!

hooten commented 3 years ago

I'd appreciate a review of https://github.com/prometheus/alertmanager/pull/2237!