ribbybibby / ssl_exporter

Exports Prometheus metrics for TLS certificates
Apache License 2.0

Using ssl_exporter with k8s #12

Closed JohanJermey closed 4 years ago

JohanJermey commented 4 years ago

Hi,

Thanks for creating this project! @ribbybibby I want to use ssl_exporter on K8s and I have the following questions:

  1. I use Prometheus. Do I need to install anything else in the cluster to use ssl_exporter? I saw that you provide a Docker image, but what is the best way to run it on K8s? Should I create a k8s Service?

  2. How should I define the target? I want to check the certificate of the cluster in which Prometheus is deployed. What should I put in the target config?

Thanks!

ribbybibby commented 4 years ago

Hi @JohanJermey, thanks for the issue.

  1. I haven't run ssl_exporter in Kubernetes myself, so I can't give you a tried and tested method or a quick example, but yes, I think you should probably define a Service and a Deployment.

  2. It depends on what you mean by the 'target of the current cluster'. Do you mean the api server certificate?

ribbybibby commented 4 years ago

Here's a Service and Deployment that work for me in minikube:

apiVersion: v1
kind: Service
metadata:
  labels:
    name: ssl-exporter
  name: ssl-exporter
spec:
  ports:
    - name: ssl-exporter
      protocol: TCP
      port: 9219
      targetPort: 9219
  selector:
    app: ssl-exporter
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ssl-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ssl-exporter
  template:
    metadata:
      name: ssl-exporter
      labels:
        app: ssl-exporter
    spec:
      initContainers:
        # Install kube ca cert as a root CA
        - name: ca
          image: alpine
          command:
            - sh
            - -c
            - |
              set -e
              apk add --update ca-certificates
              cp /var/run/secrets/kubernetes.io/serviceaccount/ca.crt /usr/local/share/ca-certificates/kube-ca.crt
              update-ca-certificates
              cp /etc/ssl/certs/* /ssl-certs
          volumeMounts:
            - name: ssl-certs
              mountPath: /ssl-certs
      containers:
        - name: ssl-exporter
          image: ribbybibby/ssl-exporter:v0.6.0
          ports:
            - name: tcp
              containerPort: 9219
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs
      volumes:
        - name: ssl-certs
          emptyDir: {}

Note the initContainer that adds the api server ca cert to the root certs. This allows you to probe the apiserver in addition to any other target backed by the standard set of root certs. If you're only interested in targeting the api server you could remove this, the volume and the volumeMounts and instead use the arg --tls.cacert=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt.
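
A minimal sketch of that apiserver-only variant, for illustration. It assumes the image's entrypoint is the exporter binary (so `args` are passed to it as flags) and that the default service account CA is mounted into the pod at the usual path; the Service above stays unchanged.

```yaml
# Sketch only: apiserver-only Deployment, without the initContainer,
# the ssl-certs volume or the volumeMounts.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ssl-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ssl-exporter
  template:
    metadata:
      labels:
        app: ssl-exporter
    spec:
      containers:
        - name: ssl-exporter
          image: ribbybibby/ssl-exporter:v0.6.0
          # Assumption: the entrypoint is the exporter binary, so this
          # flag reaches it directly. Only targets signed by the cluster
          # CA will verify with this setting.
          args:
            - --tls.cacert=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          ports:
            - name: tcp
              containerPort: 9219
```

The trade-off is that targets backed by public root CAs would no longer verify, since the cluster CA replaces the default root set rather than extending it.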

Assuming you deploy ssl-exporter in the same namespace as Prometheus, this scrape_config should suffice:

scrape_configs:
  - job_name: ssl-exporter
    metrics_path: /probe
    static_configs:
      - targets:
        - kubernetes.default.svc:443
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: ssl-exporter:9219
JohanJermey commented 4 years ago

Hi @ribbybibby

Thank you very much, this is awesome! You were quick and this is really neat!

I successfully deployed your example. A few questions/clarifications:

image

  1. I see the ssl exporter in the target list, but when I click on the link http://ssl-exporter:9219/probe?target=kubernetes.default.svc:443 nothing is shown. Any idea what I'm missing here?

  2. We are using Istio, and the certificate we need to monitor is defined like this:

tls:
   mode: SIMPLE
   privateKey: /etc/istio/os-tls/tls.key
   serverCertificate: /etc/istio/os-tls/tls.crt

How can I configure the exporter to check it?

Thanks a lot!

ribbybibby commented 4 years ago
  1. Do you mean that when you click on the link in the target list you're getting a timeout or some other error? Unless you can resolve ssl-exporter and route to the service IP in the cluster from your local machine or wherever you're viewing this page, then I wouldn't expect it to work tbh. The fact that the state is UP tells me it's reachable from the Prometheus server, which is all that really matters.
  2. I've never used istio, so you're going to have to give me more information. But essentially you need to find the address or addresses that will use that certificate and define them as a target in your prometheus scrape config.
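
As a rough illustration of adding such an address, the scrape config from earlier in the thread could be extended as below. `app.example.io:443` is a hypothetical placeholder for whatever host actually serves the Istio certificate; it is not taken from this thread.

```yaml
scrape_configs:
  - job_name: ssl-exporter
    metrics_path: /probe
    static_configs:
      - targets:
          - kubernetes.default.svc:443
          # Hypothetical: replace with the host:port that presents
          # the certificate you want to monitor.
          - app.example.io:443
    relabel_configs:
      # Pass the target address to the exporter as ?target=...
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed address as the instance label
      - source_labels: [__param_target]
        target_label: instance
      # Send the actual scrape to the exporter Service
      - target_label: __address__
        replacement: ssl-exporter:9219
```

Each entry under `targets` becomes its own probe, so every address gets its own `instance` label on the resulting series.
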
JohanJermey commented 4 years ago
  1. Yes, I tried to run it and nothing is shown. Do you have a hint on how I could make it work with a dummy certificate? I want to verify that this is working ... I think your solution is a lot better, but maybe something like this https://www.robustperception.io/get-alerted-before-your-ssl-certificates-expire but for a K8s cluster, where you put in some dummy certificate and use ssl_exporter to validate it.

  2. Yes, we started using Istio lately. Basically we are mounting the TLS certificate into each pod that needs it. However, as this is a new topic in our landscape, I'll investigate all the aspects and let you know soon.

Thank you very much @ribbybibby

ribbybibby commented 4 years ago

You can test it by querying for one of the metrics, like ssl_cert_not_before. If you see series in Prometheus, then it's working.

Then, you can achieve what that blog post achieves with an alert rule like:

groups: 
  - name: ssl_expiry.rules 
    rules: 
      - alert: SSLCertExpiringSoon 
        expr: ssl_cert_not_after - time() < 86400 * 30
        for: 10m
ribbybibby commented 4 years ago

Oh, and if you really want to verify it in your browser, you could use port-forwarding with kubectl:

$ kubectl port-forward svc/ssl-exporter 9219
Forwarding from 127.0.0.1:9219 -> 9219
Forwarding from [::1]:9219 -> 9219
Handling connection for 9219

And then curl:

$ curl 'localhost:9219/probe?target=kubernetes.default.svc:443'

Or visit http://localhost:9219/probe?target=kubernetes.default.svc:443 in your browser.

JohanJermey commented 4 years ago

Hi @ribbybibby ,

This is very cool!

When using the exporter we now get two entries back, in the following format:

ssl_cert_not_after{issuer_cn="DigiCert Global Root CA",serial_no="xxx"} 1.6733768e+09
ssl_cert_not_after{issuer_cn="DigiCert SHA2 Secure Server CA",serial_no="xxx"} 1.5922616e+09

However, we should alert only on the second one (issuer_cn="DigiCert SHA2 Secure Server CA"). Is there a way to filter so that we track only the second and not the first?

ribbybibby commented 4 years ago

The reason you are receiving two series, rather than one, is because the exporter also exports the root CA and any intermediates in the chain.

My first suggestion is that you simply don't worry about filtering these out of your alerts. It is highly unlikely that any respectable CA or intermediate would issue a certificate that expires after itself and root CAs are typically renewed far in advance of their expiry, so I think it's unlikely that you will receive any alerts from them.

If you are adamant about selecting only your own SSL certificates, then the most efficient way to do that depends entirely on the information attached to those certificates.

If you've added your organization to your certificates then that's a good one to select on:

(ssl_cert_not_after - time() < 86400 * 30) * on (instance, issuer_cn,serial_no) group_left (subject_ou) ssl_cert_subject_organization_units{subject_ou=",Our Technology Team,"}

Or, if you have a predictable domain name pattern you could use that:

(ssl_cert_not_after - time() < 86400 * 30) * on (instance, issuer_cn,serial_no) group_left (subject_cn) ssl_cert_subject_common_name{subject_cn=~".*\\.example\\.io"}
JohanJermey commented 4 years ago

Thank you!

Just to verify, in case I have the following:

ssl_cert_not_after{issuer_cn="DigiCert Global Root CA",serial_no="xxx"} 1.6733768e+09
ssl_cert_not_after{issuer_cn="DigiCert SHA2 Secure Server CA",serial_no="xxx"} 1.5922616e+09

My query should be like this?


(ssl_cert_not_after - time() < 86400 * 30) * on (instance, issuer_cn,serial_no) group_left (subject_ou) ssl_cert_subject_organization_units{issuer_cn="DigiCert SHA2 Secure Server CA"}

Am I missing something ?
ribbybibby commented 4 years ago

If you're selecting based on the issuer_cn, then it can be even simpler because issuer_cn is a label on ssl_cert_not_after:

ssl_cert_not_after{issuer_cn="DigiCert SHA2 Secure Server CA"} - time() < 86400 * 30
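
For illustration, that expression could be dropped into the alert rule from earlier in the thread roughly as follows. The `severity` label and the annotation text are hypothetical additions, not part of the original rule.

```yaml
groups:
  - name: ssl_expiry.rules
    rules:
      - alert: SSLCertExpiringSoon
        # Fires only for certificates issued by this intermediate
        # that have fewer than 30 days left.
        expr: ssl_cert_not_after{issuer_cn="DigiCert SHA2 Secure Server CA"} - time() < 86400 * 30
        for: 10m
        labels:
          severity: warning   # hypothetical label
        annotations:
          # $value is the number of seconds remaining until expiry
          summary: "Certificate for {{ $labels.instance }} expires in {{ $value | humanizeDuration }}"
```
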
JohanJermey commented 4 years ago

@ribbybibby Superb!

I've tried it and it works. I was able to fire an alert about the certificate; however, Prometheus doesn't show the right value. For example:

image

If you take the value 8.535492565999985e+06 and put it into

https://www.epochconverter.com/ you will get:

GMT: Thursday, January 1, 1970 12:00:08.535 AM
Your time zone: Thursday, January 1, 1970 2:00:08.535 AM GMT+02:00
Relative: 50 years ago

However, in the browser, when you use http://localhost:9219/probe?target=https://c ....

you will get:

ssl_cert_not_after{issuer_cn="DigiCert SHA2 Secure Server CA",serial_no="xxx"} 1.5922616e+09

which is Monday, June 15, 2020 10:53:20 PM. Any idea what is wrong?

  1. Do you perhaps know if there is a pluggable Grafana dashboard which we can use to show certificate data? We already have Grafana up and running ...

Thank you!

ribbybibby commented 4 years ago
  1. I'm not sure I understand. Is Monday, June 15, 2020 10:53:20 PM not the NotAfter value of the certificate?

  2. Someone's uploaded a dashboard to the Grafana site (https://grafana.com/grafana/dashboards/11279). I've never used it but it looks pretty good from the screenshot. If you try it let me know how you find it. I might ask the creator if I can adapt from it and host it in this repo.

JohanJermey commented 4 years ago

Hi @ribbybibby ,

Sorry, somehow the screenshot wasn't added, please have a look: image

The value 8.535492565999985e+06 (GMT: Thursday, January 1, 1970 12:00:08.535 AM ...) is what you see in Prometheus; however, in the browser (with the ssl_exporter) you see the right value, Monday, June 15, 2020 10:53:20 PM. Any idea? Am I missing something? I'm not a Prometheus expert ...

Regarding the dashboard, I'll try it. It's great that the community is taking the ssl_exporter forward :)

ribbybibby commented 4 years ago

That number (8.535492565999985e+06) is the product of the query ssl_cert_not_after{issuer_cn="DigiCert SHA2 Secure Server CA"} - time() which is the number of seconds between now and the expiry date of your certificate, not a unix timestamp.

The reason you're getting a date of GMT: Thursday, January 1, 1970 12:00:08.535 AM from that site is because the input field is too short and cuts off the 6 from the value, leaving you with 8.535492565999985e+0 which is 8.535492565999985 seconds, or Thursday, January 1, 1970 12:00:08.535 AM when used as a unix timestamp.
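
To make the unit explicit, a recording rule along these lines could expose the remaining lifetime in days instead of seconds. The group and metric names are hypothetical, chosen only for this sketch.

```yaml
groups:
  - name: ssl_expiry_days.rules   # hypothetical group name
    rules:
      # ssl_cert_not_after is a unix timestamp; subtracting time()
      # gives seconds until expiry, and 86400 is seconds per day.
      # With the value from this thread:
      #   8.535492565999985e+06 / 86400 ≈ 98.8 days
      - record: ssl_cert_days_until_expiry   # hypothetical metric name
        expr: (ssl_cert_not_after - time()) / 86400
```

Graphing the recorded series rather than the raw difference avoids mistaking a duration for a timestamp.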

JohanJermey commented 4 years ago

Hi @ribbybibby ,

Thanks for the clarification! Btw, I tried the Grafana dashboard and it doesn't work. (The ssl exporter works and I was able to see the data when I click on the target....) I got the following screen ... Did you try it? Do Grafana and Prometheus need to be in the same namespace?

image


JohanJermey commented 4 years ago

Thanks for everything!