aki0000 / k8s-myplayground


Install New Relic infra for monitoring #20

Open · aki0000 opened 3 years ago

aki0000 commented 3 years ago

https://qiita.com/tkyonezu/items/912fb7cfaab2f59276dd

https://linuxtut.com/en/6cf557a496151010e7e8/

aki0000 commented 3 years ago

The New Relic agent doesn't support the arm architecture, so this ticket is closed.

aki0000 commented 3 years ago

https://docs.newrelic.com/jp/docs/integrations/kubernetes-integration/installation/kubernetes-integration-install-configure/

kube-state-metrics supports arm64 from v2.0.0 onward: https://github.com/kubernetes/kube-state-metrics/issues/1340
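
One way to confirm that the cluster is actually running a v2.0.0+ (multi-arch) kube-state-metrics image would be something like the following sketch, assuming the deployment is named kube-state-metrics in kube-system as the pod listing in the next comment suggests:

# Print the image used by the kube-state-metrics deployment; it should be v2.0.0 or newer for arm64
kubectl -n kube-system get deployment kube-state-metrics -o jsonpath='{.spec.template.spec.containers[0].image}'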

aki0000 commented 3 years ago

root@ras01:~# k get pods -A
NAMESPACE        NAME                                  READY   STATUS             RESTARTS   AGE
default          newrelic-infra-844j9                  0/1     CrashLoopBackOff   3          3m57s
default          newrelic-infra-cfbr9                  0/1     CrashLoopBackOff   3          3m57s
default          newrelic-infra-zqhqf                  0/1     CrashLoopBackOff   3          3m57s
kube-system      coredns-558bd4d5db-98d5c              1/1     Running            26         37d
kube-system      coredns-558bd4d5db-f9mn9              1/1     Running            26         37d
kube-system      etcd-ras01                            1/1     Running            35         37d
kube-system      kube-apiserver-ras01                  1/1     Running            37         37d
kube-system      kube-controller-manager-ras01         1/1     Running            43         37d
kube-system      kube-flannel-ds-arm64-cmrgt           1/1     Running            13         37d
kube-system      kube-flannel-ds-arm64-msgg8           1/1     Running            27         37d
kube-system      kube-flannel-ds-arm64-rvmqf           1/1     Running            11         37d
kube-system      kube-proxy-9tqks                      1/1     Running            11         37d
kube-system      kube-proxy-tzl2t                      1/1     Running            26         37d
kube-system      kube-proxy-zxwcv                      1/1     Running            13         37d
kube-system      kube-scheduler-ras01                  1/1     Running            42         37d
kube-system      kube-state-metrics-6bcdf495b6-xqgr7   1/1     Running            0          6m50s
kube-system      metrics-server-7885b9fd6b-9rcph       1/1     Running            11         25d
metallb-system   controller-64f86798cc-455zl           1/1     Running            0          4d6h
metallb-system   speaker-9nw24                         1/1     Running            11         37d
metallb-system   speaker-cxvvh                         1/1     Running            25         37d
metallb-system   speaker-vjd5r                         1/1     Running            12         37d

root@ras01:~# k logs newrelic-infra-844j9
time="2021-05-17T16:44:07Z" level=info msg="runtime configuration" agentUser=root component="New Relic Infrastructure Agent" executablePath= maxProcs=1 pluginDir="[/etc/newrelic-infra/integrations.d /var/db/newrelic-infra/integrations.d]"
time="2021-05-17T16:44:07Z" level=info msg="Checking network connectivity..." component=AgentService service=newrelic-infra
time="2021-05-17T16:44:07Z" level=warning msg="URL error detected. May be a configuration problem or a network connectivity issue." component=AgentService error="Head \"https://infra-api.newrelic.com\": dial tcp: lookup infra-api.newrelic.com on 10.96.0.10:53: server misbehaving" service=newrelic-infra
time="2021-05-17T16:44:07Z" level=warning msg="Collector endpoint not reachable, retrying..." collector_url="https://infra-api.newrelic.com" component=AgentService error="Head \"https://infra-api.newrelic.com\": dial tcp: lookup infra-api.newrelic.com on 10.96.0.10:53: server misbehaving" service=newrelic-infra
time="2021-05-17T16:44:08Z" level=warning msg="URL error detected. May be a configuration problem or a network connectivity issue." component=AgentService error="Head \"https://infra-api.newrelic.com\": dial tcp: lookup infra-api.newrelic.com on 10.96.0.10:53: server misbehaving" service=newrelic-infra
time="2021-05-17T16:44:08Z" level=warning msg="Collector endpoint not reachable, retrying..." collector_url="https://infra-api.newrelic.com" component=AgentService error="Head \"https://infra-api.newrelic.com\": dial tcp: lookup infra-api.newrelic.com on 10.96.0.10:53: server misbehaving" service=newrelic-infra
time="2021-05-17T16:44:10Z" level=warning msg="URL error detected. May be a configuration problem or a network connectivity issue." component=AgentService error="Head \"https://infra-api.newrelic.com\": dial tcp: lookup infra-api.newrelic.com on 10.96.0.10:53: server misbehaving" service=newrelic-infra
time="2021-05-17T16:44:10Z" level=warning msg="Collector endpoint not reachable, retrying..." collector_url="https://infra-api.newrelic.com" component=AgentService error="Head \"https://infra-api.newrelic.com\": dial tcp: lookup infra-api.newrelic.com on 10.96.0.10:53: server misbehaving" service=newrelic-infra
time="2021-05-17T16:44:11Z" level=warning msg="URL error detected. May be a configuration problem or a network connectivity issue." component=AgentService error="Head \"https://infra-api.newrelic.com\": dial tcp: lookup infra-api.newrelic.com on 10.96.0.10:53: server misbehaving" service=newrelic-infra
time="2021-05-17T16:44:11Z" level=warning msg="Collector endpoint not reachable, retrying..." collector_url="https://infra-api.newrelic.com" component=AgentService error="Head \"https://infra-api.newrelic.com\": dial tcp: lookup infra-api.newrelic.
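
The CrashLoopBackOff comes from the DNS failure in these logs: the agent tries to resolve infra-api.newrelic.com through the cluster DNS service at 10.96.0.10 and gets "server misbehaving". A rough way to check whether CoreDNS itself is healthy (a sketch, assuming the default kubeadm label k8s-app=kube-dns on the CoreDNS pods):

# CoreDNS pods and recent logs; look for forwarding/upstream errors
kubectl -n kube-system get pods -l k8s-app=kube-dns
kubectl -n kube-system logs -l k8s-app=kube-dns --tail=50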

aki0000 commented 3 years ago

Internal pods resolve FQDNs through CoreDNS, as shown below:

root@ras01:~# k exec -it dnsutils -- cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5

root@ras01:~# k get service kube-dns -n kube-system
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   38d
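
Since pods point at the kube-dns ClusterIP, the failing lookup is most likely between CoreDNS and its upstream resolver rather than in the agent itself. A sketch for comparing the two paths (the 8.8.8.8 resolver is just an example, not something used in this cluster):

# Resolve the collector endpoint through the cluster DNS (the path the agent uses)
kubectl exec -it dnsutils -- nslookup infra-api.newrelic.com 10.96.0.10

# Resolve the same name through an external resolver, bypassing CoreDNS
kubectl exec -it dnsutils -- nslookup infra-api.newrelic.com 8.8.8.8

# The default Corefile forwards to the node's /etc/resolv.conf; check how it is configured
kubectl -n kube-system get configmap coredns -o yaml
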
aki0000 commented 3 years ago

The cluster data shows up now. But on the Control Plane tab, there is no data....

(screenshots: nr, nodata)
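
Control plane data generally requires an agent pod running on the control plane node (ras01 here) so it can query the local scheduler/controller-manager/etcd/apiserver endpoints (the commented *_ENDPOINT_URL variables in the manifest in the next comment). Two things worth checking (a sketch; the label and DaemonSet name are taken from that manifest):

# Is a newrelic-infra pod scheduled on ras01? (the DaemonSet tolerates the control plane taints)
kubectl get pods -o wide -l name=newrelic-infra

# Raise agent verbosity to see why control plane components are not being reported
kubectl set env daemonset/newrelic-infra NRIA_VERBOSE=1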

aki0000 commented 3 years ago

Changed dnsPolicy from ClusterFirstWithHostNet to Default. But is this really correct?....

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: newrelic-infra
  namespace: default
  labels:
    app: newrelic-infra
spec:
  selector:
    matchLabels:
      name: newrelic-infra
  updateStrategy:
      type: RollingUpdate
  template:
    metadata:
      labels:
        name: newrelic-infra
    spec:
      serviceAccountName: newrelic
      # hostNetwork: true # This option is a requirement for the Infrastructure Agent to report the proper hostname in New Relic.
      # dnsPolicy: ClusterFirstWithHostNet
      dnsPolicy: Default
      containers:
        - name: newrelic-infra
          image: newrelic/infrastructure-k8s:2.4.0
          securityContext:
            privileged: true
          resources:
            limits:
              memory: 300M
            requests:
              cpu: 100m
              memory: 150M
          volumeMounts:
            - mountPath: /host
              name: host-volume
              readOnly: true
            - mountPath: /var/run/docker.sock
              name: host-docker-socket
            - mountPath: /var/db/newrelic-infra/integrations.d/
              name: nri-default-integration-cfg-volume
            - mountPath: /etc/newrelic-infra/integrations.d/
              name: nri-integration-cfg-volume
          env:
            - name: "CLUSTER_NAME"
              value: "ras_cluster"
            - name: "NRIA_LICENSE_KEY"
              value: "20ffcf06b11372766a366e64c2c669c25c37NRAL"
            - name: "NRIA_VERBOSE"
              value: "0"
           # - name: "KUBE_STATE_METRICS_POD_LABEL" # Enables discovery of the KSM pod via a label. The value of the label needs to be "true".
           #   value: "<YOUR_LABEL>" # Remember to replace this placeholder with the label name of your choice.
           # - name: "KUBE_STATE_METRICS_PORT" # If the KUBE_STATE_METRICS_POD_LABEL is present, it changes the port queried in the pod.
           #   value: "8080"
           # - name: "KUBE_STATE_METRICS_SCHEME" # If the KUBE_STATE_METRICS_POD_LABEL is present, it changes the scheme used to send to request to the pod.
           #   value: "http"
           # - name: "CADVISOR_PORT" # Enable direct connection to cAdvisor by specifying the port. Needed for Kubernetes versions prior to 1.7.6.
           #   value: "4194"
           # - name: "KUBE_STATE_METRICS_URL" # If this value is specified then discovery process for kube-state-metrics endpoint won't be triggered.
           #   value: "http://172.17.0.3:8080" # This is example value. Only HTTP request is accepted.
           # - name: "ETCD_TLS_SECRET_NAME" # Name of the secret containing the cacert, cert and key used for setting the mTLS config for retrieving metrics from ETCD. In case this is set uncomment the secret cluster role and the rolebinding.
           #   value: "newrelic-infra-etcd-tls-secret"
           # - name: "ETCD_TLS_SECRET_NAMESPACE" # Namespace where the the secret specified in ETCD_TLS_SECRET_NAME was created. In case this is set uncomment the secret cluster role and the rolebinding.
           #   value: "default"
           # Note: Usage of API_SERVER_SECURE_PORT has been deprecated in favor of API_SERVER_ENDPOINT_URL.
           # - name: API_SERVER_SECURE_PORT
           #   value: "6443"
           # - name: "SCHEDULER_ENDPOINT_URL"
           #   value: "https://localhost:10259"
           # - name: "ETCD_ENDPOINT_URL"
           #   value: "https://localhost:9979"
           # - name: "CONTROLLER_MANAGER_ENDPOINT_URL"
           #   value: "https://localhost:10257"
           # - name: "API_SERVER_ENDPOINT_URL"
           #   value: "https://localhost:6443"
            - name: "NRIA_DISPLAY_NAME"
              valueFrom:
                fieldRef:
                  apiVersion: "v1"
                  fieldPath: "spec.nodeName"
            - name: "NRK8S_NODE_NAME"
              valueFrom:
                fieldRef:
                  apiVersion: "v1"
                  fieldPath: "spec.nodeName"
            - name: "NRIA_CUSTOM_ATTRIBUTES"
              value: '{"clusterName":"$(CLUSTER_NAME)"}'
            - name: "NRIA_PASSTHROUGH_ENVIRONMENT"
              value: "KUBERNETES_SERVICE_HOST,KUBERNETES_SERVICE_PORT,CLUSTER_NAME,CADVISOR_PORT,NRK8S_NODE_NAME,KUBE_STATE_METRICS_URL,KUBE_STATE_METRICS_POD_LABEL,ETCD_TLS_SECRET_NAME,ETCD_TLS_SECRET_NAMESPACE,API_SERVER_SECURE_PORT,KUBE_STATE_METRICS_SCHEME,KUBE_STATE_METRICS_PORT,SCHEDULER_ENDPOINT_URL,ETCD_ENDPOINT_URL,CONTROLLER_MANAGER_ENDPOINT_URL,API_SERVER_ENDPOINT_URL,DISABLE_KUBE_STATE_METRICS,NETWORK_ROUTE_FILE"
      volumes:
        - name: host-volume
          hostPath:
            path: /
        - name: host-docker-socket
          hostPath:
            path: /var/run/docker.sock
        - name: nri-default-integration-cfg-volume
          configMap:
            name: nri-default-integration-cfg
        - name: nri-integration-cfg-volume
          configMap:
            name: nri-integration-cfg
      tolerations:
        - operator: "Exists"
          effect: "NoSchedule"
        - operator: "Exists"
          effect: "NoExecute"