prometheus-operator / kube-prometheus

Use Prometheus to monitor Kubernetes and applications running on Kubernetes
https://prometheus-operator.dev/
Apache License 2.0

Adding AWS CNI Metrics #402

Open rupadhya1 opened 4 years ago

rupadhya1 commented 4 years ago

I installed kube-prometheus using the quick start.

kube-prometheus provides an example (examples/eks-cni-example.jsonnet), and the EKS CNI support page refers to it.

The doc refers to prometheus-serviceMonitorAwsEksCNI.yaml.

I tried to use jsonnet to generate the new YAML but have been unsuccessful. How do I generate the YAML file?

rupadhya1 commented 4 years ago

Here are the instructions I used to compile the examples/eks-cni-example.jsonnet to monitor the AWS VPC CNI Metrics.

Hope this helps.

Install go

wget https://dl.google.com/go/go1.13.7.linux-amd64.tar.gz
tar xvzf go1.13.7.linux-amd64.tar.gz 
sudo mv go /usr/local
export PATH=$PATH:/usr/local/go/bin:~/go/bin 

Run go version to verify go is in the path

go version

go version go1.13.7 linux/amd64

Install the pre-requisites for compiling additional jsonnet files

go get github.com/jsonnet-bundler/jsonnet-bundler/cmd/jb
go get github.com/brancz/gojsontoyaml
go get github.com/google/go-jsonnet/cmd/jsonnet

Get the kube-prometheus repo

git clone git@github.com:coreos/kube-prometheus.git
cd kube-prometheus 
jb install 

This reads jsonnetfile.json, downloads the necessary dependencies, and creates the vendor directory.

Copy the eks-cni-example.jsonnet and compile it

cp examples/eks-cni-example.jsonnet .
./build.sh eks-cni-example.jsonnet 

This deletes and recreates all the manifests, including the YAML for AWS CNI (Service, ServiceMonitor, and the rules).

Side note: the latest build creates all the manifests in the manifests directory, and because of this you may have to apply the manifests multiple times.

rupadhya1 commented 4 years ago

Here are the yamls for adding the AWS VPC CNI Metrics:

---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: aws-node
  name: aws-node
  namespace: kube-system
spec:
  clusterIP: None
  ports:
  - name: cni-metrics-port
    port: 61678
    targetPort: 61678
  selector:
    k8s-app: aws-node

---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: eks-cni
  name: awsekscni
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s
    path: /metrics
    port: cni-metrics-port
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: aws-node

============== Prometheus-rules

  - name: kube-prometheus-eks.rules
    rules:
    - alert: EksAvailableIPs
      annotations:
        message: Instance {{ $labels.instance }} has less than 10 IPs available.
        runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-eksavailableips
      expr: sum by(instance) (awscni_total_ip_addresses) - sum by(instance) (awscni_assigned_ip_addresses)
        < 10
      for: 10m
      labels:
        severity: critical
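
The rule group above is only a fragment and cannot be applied on its own. As a minimal sketch, it could be wrapped in a PrometheusRule object like the one below. The object name `eks-cni-rules` is hypothetical, and the `prometheus: k8s` and `role: alert-rules` labels are an assumption about what the Prometheus ruleSelector matches in this setup; check the Prometheus custom resource in your cluster before relying on them.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: eks-cni-rules          # hypothetical name, not from the thread
  namespace: monitoring
  labels:
    prometheus: k8s            # assumption: must match the Prometheus ruleSelector
    role: alert-rules          # assumption: must match the Prometheus ruleSelector
spec:
  groups:
  - name: kube-prometheus-eks.rules
    rules:
    # Fires when a node has had fewer than 10 free IPs for 10 minutes.
    - alert: EksAvailableIPs
      annotations:
        message: Instance {{ $labels.instance }} has less than 10 IPs available.
      expr: sum by(instance) (awscni_total_ip_addresses) - sum by(instance) (awscni_assigned_ip_addresses) < 10
      for: 10m
      labels:
        severity: critical
```
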

brancz commented 4 years ago

Do you feel there is anything additional in our setup that should be added or additionally documented? If you could contribute those, then that would be immensely helpful for the rest of the community! :)

anthosz commented 1 year ago

Hello,

I just tried to apply the Service and ServiceMonitor from the earlier comment, but it doesn't work.

I installed the eks/cni-metrics-helper chart (0.1.18) with USE_CLOUDWATCH = false, together with the Service and ServiceMonitor mentioned above, but I am unable to access these metrics.

codesenju commented 1 year ago

ServiceMonitor also did not work for me.

What worked for me was patching my aws-node DaemonSet like this:

kubectl patch daemonset aws-node -n kube-system --patch '
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "61678"
'

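# To remove the annotations again (an unquoted null deletes the key;
# a quoted "null" would only set the literal string "null")
kubectl patch daemonset aws-node -n kube-system --patch '
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: null
        prometheus.io/path: null
        prometheus.io/port: null
'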

After applying the patch, I could see the aws-node-xxx targets under kubernetes-pods in the Prometheus UI.

anthosz commented 1 year ago

I use Terraform, so I can't use kubectl patch without a hack. As a workaround, I added a new scrape job that scrapes the CNI pods, and it works like a charm. I don't understand why this isn't clearly documented...
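
The comment does not show the scrape job itself, so the following is only a sketch of what such a job might look like, not the configuration actually used. It assumes the aws-node pods carry the `k8s-app: aws-node` label and expose metrics on port 61678 (both taken from the Service earlier in this thread), and that pod IPs are IPv4. The job name `aws-cni` and the `node` target label are hypothetical. With the Prometheus Operator, a job like this would typically be supplied through the additional scrape configs mechanism rather than by editing prometheus.yml directly.

```yaml
scrape_configs:
- job_name: aws-cni                  # hypothetical job name
  metrics_path: /metrics
  kubernetes_sd_configs:
  - role: pod
    namespaces:
      names:
      - kube-system
  relabel_configs:
  # Keep only the aws-node pods (label k8s-app=aws-node).
  - source_labels: [__meta_kubernetes_pod_label_k8s_app]
    regex: aws-node
    action: keep
  # Scrape the CNI metrics port regardless of which container port was discovered.
  - source_labels: [__meta_kubernetes_pod_ip]
    regex: (.+)
    target_label: __address__
    replacement: $1:61678
  # Record the node name so each series can be tied back to a node.
  - source_labels: [__meta_kubernetes_pod_node_name]
    target_label: node
```

Unlike the annotation-based approach, this leaves the aws-node DaemonSet untouched, which is what makes it easier to manage from Terraform.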