microsoft / MSLab

Azure Stack HCI, Windows 10 and Windows Server rapid lab deployment scripts
MIT License

Azure Stack HCI and Kubernetes monitoring fails to enable #481

Closed jaromirk closed 2 years ago

jaromirk commented 2 years ago

The Helm chart is deployed, but monitoring is not working. Release details:

NAME: azmon-containers-release-1
LAST DEPLOYED: Wed Jan  5 05:19:36 2022
NAMESPACE: default
STATUS: deployed
REVISION: 2
TEST SUITE: None
USER-SUPPLIED VALUES:
omsagent:
  domain: opinsights.azure.com
  env:
    clusterId: /subscriptions/0d2cb74f-ef20-4daf-9888-32c8a4846fcf/resourceGroups/AzSHCI-Cluster-rg/providers/Microsoft.Kubernetes/connectedClusters/demo
    clusterRegion: eastus
  secret:
    key: Zfs8JJiRD+vRPlGDGoNyxBj8izXnUwNoZZ4loB+gz5GwEEq5gjwTOFiShTWHsJvbTBTJsiUu2SGdL7pyovvXZw==
    wsid: ef10cd6b-8512-45ef-8422-9e3cc829348e

COMPUTED VALUES:
Azure:
  Cluster:
    Region: <your_cluster_region>
    ResourceId: <your_cluster_id>
  Extension:
    Name: ""
    ResourceId: ""
  proxySettings:
    httpProxy: ""
    httpsProxy: ""
    isProxyEnabled: false
    noProxy: ""
    proxyCert: ""
omsagent:
  ISTEST: false
  daemonset:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - labelSelector: null
            matchExpressions:
            - key: beta.kubernetes.io/os
              operator: In
              values:
              - linux
            - key: type
              operator: NotIn
              values:
              - virtual-kubelet
            - key: beta.kubernetes.io/arch
              operator: In
              values:
              - amd64
  deployment:
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - preference:
            matchExpressions:
            - key: storageprofile
              operator: NotIn
              values:
              - managed
          weight: 1
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - labelSelector: null
            matchExpressions:
            - key: beta.kubernetes.io/os
              operator: In
              values:
              - linux
            - key: type
              operator: NotIn
              values:
              - virtual-kubelet
            - key: kubernetes.io/role
              operator: NotIn
              values:
              - master
            - key: beta.kubernetes.io/arch
              operator: In
              values:
              - amd64
  domain: opinsights.azure.com
  env:
    clusterId: /subscriptions/0d2cb74f-ef20-4daf-9888-32c8a4846fcf/resourceGroups/AzSHCI-Cluster-rg/providers/Microsoft.Kubernetes/connectedClusters/demo
    clusterName: <your_cluster_name>
    clusterRegion: eastus
  image:
    agentVersion: 1.10.0.1
    dockerProviderVersion: 16.0.0-0
    pullPolicy: IfNotPresent
    repo: mcr.microsoft.com/azuremonitor/containerinsights/ciprod
    tag: ciprod10132021
    tagWindows: win-ciprod10132021
  logsettings:
    custommountpath: ""
    logflushintervalsecs: "15"
    tailbufchunksizemegabytes: "1"
    tailbufmaxsizemegabytes: "1"
  priority: 10
  proxy: <your_proxy_config>
  rbac: true
  resources:
    daemonsetlinux:
      limits:
        cpu: 150m
        memory: 750Mi
      requests:
        cpu: 75m
        memory: 325Mi
    daemonsetlinuxsidecar:
      limits:
        cpu: 500m
        memory: 1Gi
      requests:
        cpu: 75m
        memory: 225Mi
    daemonsetwindows:
      limits:
        cpu: 200m
        memory: 600Mi
    deployment:
      limits:
        cpu: 1
        memory: 1Gi
      requests:
        cpu: 150m
        memory: 250Mi
  secret:
    key: Zfs8JJiRD+vRPlGDGoNyxBj8izXnUwNoZZ4loB+gz5GwEEq5gjwTOFiShTWHsJvbTBTJsiUu2SGdL7pyovvXZw==
    wsid: ef10cd6b-8512-45ef-8422-9e3cc829348e
  sidecarscraping: true
  tolerations:
  - effect: NoSchedule
    operator: Exists
  - effect: NoExecute
    operator: Exists
  - effect: PreferNoSchedule
    operator: Exists

HOOKS:
MANIFEST:
---
# Source: azuremonitor-containers/templates/omsagent-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: omsagent
  namespace: kube-system
  labels:
    chart: azuremonitor-containers-2.9.0
    release: azmon-containers-release-1
    heritage: Helm
---
# Source: azuremonitor-containers/templates/omsagent-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: omsagent-secret
  namespace: kube-system
  labels:
    chart: azuremonitor-containers-2.9.0
    release: azmon-containers-release-1
    heritage: Helm
type: Opaque
data:
  WSID: "ZWYxMGNkNmItODUxMi00NWVmLTg0MjItOWUzY2M4MjkzNDhl"
  KEY: "WmZzOEpKaVJEK3ZSUGxHREdvTnl4Qmo4aXpYblV3Tm9aWjRsb0IrZ3o1R3dFRXE1Z2p3VE9GaVNoVFdIc0p2YlRCVEpzaVV1MlNHZEw3cHlvdnZYWnc9PQ=="
  DOMAIN: "b3BpbnNpZ2h0cy5henVyZS5jb20="
---
# Source: azuremonitor-containers/templates/omsagent-rs-configmap.yaml
kind: ConfigMap
apiVersion: v1
data:
  kube.conf: |
    # Fluentd config file for OMS Docker - cluster components (kubeAPI)
    #fluent forward plugin
     <source>
      type forward
      port "#{ENV['HEALTHMODEL_REPLICASET_SERVICE_SERVICE_PORT']}"
      bind 0.0.0.0
      chunk_size_limit 4m
     </source>

     #Kubernetes pod inventory
     <source>
      type kubepodinventory
      tag oms.containerinsights.KubePodInventory
      run_interval 60
      log_level debug
     </source>

     #Kubernetes Persistent Volume inventory
     <source>
      type kubepvinventory
      tag oms.containerinsights.KubePVInventory
      run_interval 60
      log_level debug
     </source>

     #Kubernetes events
     <source>
      type kubeevents
      tag oms.containerinsights.KubeEvents
      run_interval 60
      log_level debug
      </source>

     #Kubernetes Nodes
     <source>
      type kubenodeinventory
      tag oms.containerinsights.KubeNodeInventory
      run_interval 60
      log_level debug
     </source>

     #Kubernetes health
     <source>
      type kubehealth
      tag kubehealth.ReplicaSet
      run_interval 60
      log_level debug
     </source>

     #cadvisor perf- Windows nodes
     <source>
      type wincadvisorperf
      tag oms.api.wincadvisorperf
      run_interval 60
      log_level debug
     </source>

     #Kubernetes object state - deployments
     <source>
      type kubestatedeployments
      tag oms.containerinsights.KubeStateDeployments
      run_interval 60
      log_level debug
     </source>

     #Kubernetes object state - HPA
     <source>
      type kubestatehpa
      tag oms.containerinsights.KubeStateHpa
      run_interval 60
      log_level debug
     </source>
     <filter mdm.kubenodeinventory**>
      type filter_inventory2mdm
      log_level info
     </filter>

     # custom_metrics_mdm filter plugin for perf data from windows nodes
     <filter mdm.cadvisorperf**>
      type filter_cadvisor2mdm
      metrics_to_collect cpuUsageNanoCores,memoryWorkingSetBytes
      log_level info
     </filter>

     #health model aggregation filter
     <filter kubehealth**>
      type filter_health_model_builder
     </filter>

     <match oms.containerinsights.KubePodInventory**>
      type out_oms
      log_level debug
      num_threads 2
      buffer_chunk_limit 4m
      buffer_type file
      buffer_path %STATE_DIR_WS%/out_oms_kubepods*.buffer
      buffer_queue_limit 20
      buffer_queue_full_action drop_oldest_chunk
      flush_interval 20s
      retry_limit 10
      retry_wait 5s
      max_retry_wait 5m
     </match>

     <match oms.containerinsights.KubePVInventory**>
      type out_oms
      log_level debug
      num_threads 5
      buffer_chunk_limit 4m
      buffer_type file
      buffer_path %STATE_DIR_WS%/state/out_oms_kubepv*.buffer
      buffer_queue_limit 20
      buffer_queue_full_action drop_oldest_chunk
      flush_interval 20s
      retry_limit 10
      retry_wait 5s
      max_retry_wait 5m
    </match>

     <match oms.containerinsights.KubeEvents**>
      type out_oms
      log_level debug
      num_threads 2
      buffer_chunk_limit 4m
      buffer_type file
      buffer_path %STATE_DIR_WS%/out_oms_kubeevents*.buffer
      buffer_queue_limit 20
      buffer_queue_full_action drop_oldest_chunk
      flush_interval 20s
      retry_limit 10
      retry_wait 5s
      max_retry_wait 5m
     </match>

     <match oms.containerinsights.KubeServices**>
      type out_oms
      log_level debug
      num_threads 2
      buffer_chunk_limit 4m
      buffer_type file
      buffer_path %STATE_DIR_WS%/out_oms_kubeservices*.buffer
      buffer_queue_limit 20
      buffer_queue_full_action drop_oldest_chunk
      flush_interval 20s
      retry_limit 10
      retry_wait 5s
      max_retry_wait 5m
     </match>

     <match oms.containerinsights.KubeNodeInventory**>
      type out_oms
      log_level debug
      num_threads 2
      buffer_chunk_limit 4m
      buffer_type file
      buffer_path %STATE_DIR_WS%/state/out_oms_kubenodes*.buffer
      buffer_queue_limit 20
      buffer_queue_full_action drop_oldest_chunk
      flush_interval 20s
      retry_limit 10
      retry_wait 5s
      max_retry_wait 5m
     </match>

     <match oms.api.ContainerNodeInventory**>
      type out_oms
      log_level debug
      num_threads 3
      buffer_chunk_limit 4m
      buffer_type file
      buffer_path %STATE_DIR_WS%/out_oms_containernodeinventory*.buffer
      buffer_queue_limit 20
      flush_interval 20s
      retry_limit 10
      retry_wait 5s
      max_retry_wait 5m
     </match>

     <match oms.api.KubePerf**>
      type out_oms
      log_level debug
      num_threads 2
      buffer_chunk_limit 4m
      buffer_type file
      buffer_path %STATE_DIR_WS%/out_oms_kubeperf*.buffer
      buffer_queue_limit 20
      buffer_queue_full_action drop_oldest_chunk
      flush_interval 20s
      retry_limit 10
      retry_wait 5s
      max_retry_wait 5m
     </match>

     <match mdm.kubepodinventory** mdm.kubenodeinventory** >
      type out_mdm
      log_level debug
      num_threads 5
      buffer_chunk_limit 4m
      buffer_type file
      buffer_path %STATE_DIR_WS%/out_mdm_*.buffer
      buffer_queue_limit 20
      buffer_queue_full_action drop_oldest_chunk
      flush_interval 20s
      retry_limit 10
      retry_wait 30s
      max_retry_wait 9m
      retry_mdm_post_wait_minutes 30
     </match>

     <match oms.api.wincadvisorperf**>
      type out_oms
      log_level debug
      num_threads 5
      buffer_chunk_limit 4m
      buffer_type file
      buffer_path %STATE_DIR_WS%/out_oms_api_wincadvisorperf*.buffer
      buffer_queue_limit 20
      buffer_queue_full_action drop_oldest_chunk
      flush_interval 20s
      retry_limit 10
      retry_wait 5s
      max_retry_wait 5m
     </match>

     <match mdm.cadvisorperf**>
      type out_mdm
      log_level debug
      num_threads 5
      buffer_chunk_limit 4m
      buffer_type file
      buffer_path %STATE_DIR_WS%/out_mdm_cdvisorperf*.buffer
      buffer_queue_limit 20
      buffer_queue_full_action drop_oldest_chunk
      flush_interval 20s
      retry_limit 10
      retry_wait 5s
      max_retry_wait 5m
      retry_mdm_post_wait_minutes 30
     </match>

     <match kubehealth.Signals**>
      type out_oms
      log_level debug
      num_threads 5
      buffer_chunk_limit 4m
      buffer_type file
      buffer_path %STATE_DIR_WS%/out_oms_kubehealth*.buffer
      buffer_queue_limit 20
      buffer_queue_full_action drop_oldest_chunk
      flush_interval 20s
      retry_limit 10
      retry_wait 5s
      max_retry_wait 5m
     </match>

     <match oms.api.InsightsMetrics**>
      type out_oms
      log_level debug
      num_threads 5
      buffer_chunk_limit 4m
      buffer_type file
      buffer_path %STATE_DIR_WS%/out_oms_insightsmetrics*.buffer
      buffer_queue_limit 20
      buffer_queue_full_action drop_oldest_chunk
      flush_interval 20s
      retry_limit 10
      retry_wait 5s
      max_retry_wait 5m
    </match>
metadata:
  name: omsagent-rs-config
  namespace: kube-system
  labels:
    chart: azuremonitor-containers-2.9.0
    release: azmon-containers-release-1
    heritage: Helm
---
# Source: azuremonitor-containers/templates/omsagent-crd.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: healthstates.azmon.container.insights
  namespace: kube-system
spec:
  group: azmon.container.insights
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          state:
            type: string
  scope: Namespaced
  names:
    plural: healthstates
    kind: HealthState
---
# Source: azuremonitor-containers/templates/omsagent-rbac.yaml
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: omsagent-reader
  labels:
    chart: azuremonitor-containers-2.9.0
    release: azmon-containers-release-1
    heritage: Helm
rules:
- apiGroups: [""]
  resources: ["pods", "events", "nodes", "nodes/stats", "nodes/metrics", "nodes/spec", "nodes/proxy", "namespaces", "services", "persistentvolumes"]
  verbs: ["list", "get", "watch"]
- apiGroups: ["apps", "extensions", "autoscaling"]
  resources: ["replicasets", "deployments", "horizontalpodautoscalers"]
  verbs: ["list"]
- apiGroups: ["azmon.container.insights"]
  resources: ["healthstates"]
  verbs: ["get", "create", "patch"]
- apiGroups: ["clusterconfig.azure.com"]
  resources: ["azureclusteridentityrequests", "azureclusteridentityrequests/status"]
  resourceNames: ["container-insights-clusteridentityrequest"]
  verbs: ["get", "create", "patch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
#arc k8s extension model grants access as part of the extension msi
#remove this explicit permission once the extension available in public preview
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["container-insights-clusteridentityrequest-token"]
  verbs: ["get"]
---
# Source: azuremonitor-containers/templates/omsagent-rbac.yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: omsagentclusterrolebinding
  labels:
    chart: azuremonitor-containers-2.9.0
    release: azmon-containers-release-1
    heritage: Helm
subjects:
  - kind: ServiceAccount
    name: omsagent
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: omsagent-reader
  apiGroup: rbac.authorization.k8s.io
---
# Source: azuremonitor-containers/templates/omsagent-service.yaml
kind: Service
apiVersion: v1
metadata:
  name: healthmodel-replicaset-service
  namespace: kube-system
spec:
  selector:
    rsName: "omsagent-rs"
  ports:
    - protocol: TCP
      port: 25227
      targetPort: in-rs-tcp
---
# Source: azuremonitor-containers/templates/omsagent-daemonset-windows.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
 name: omsagent-win
 namespace: kube-system
 labels:
   chart: azuremonitor-containers-2.9.0
   release: azmon-containers-release-1
   heritage: Helm
   component: oms-agent-win
   tier: node-win
spec:
 updateStrategy:
  type: RollingUpdate
 selector:
   matchLabels:
     dsName: "omsagent-ds"
 template:
  metadata:
   labels:
    dsName: "omsagent-ds"
   annotations:
    agentVersion: win-ciprod10132021
    dockerProviderVersion: 16.0.0-0
    schema-versions: "v1"
    checksum/secret: 1bbde4b81f4df456b3fa2058fc8b00e1cf6b87baf3a1871c0bfb15ac59f050db
    checksum/config: df888ab22fda4ade5df05ea26f22950bf9d469d8a10c8f35d905c3bde4fe6e3f
  spec:
   priorityClassName: omsagent
   dnsConfig:
     options:
       - name: ndots
         value: "3"
   nodeSelector:
      kubernetes.io/os: windows
   serviceAccountName: omsagent
   containers:
     - name: omsagent-win
       image: mcr.microsoft.com/azuremonitor/containerinsights/ciprod:win-ciprod10132021
       imagePullPolicy: IfNotPresent
       resources:
         limits:
           cpu: 200m
           memory: 600Mi
       env:
       - name: AKS_RESOURCE_ID
         value: "/subscriptions/0d2cb74f-ef20-4daf-9888-32c8a4846fcf/resourceGroups/AzSHCI-Cluster-rg/providers/Microsoft.Kubernetes/connectedClusters/demo"
       - name: AKS_REGION
         value: "eastus"
       - name: CONTROLLER_TYPE
         value: "DaemonSet"
       - name: HOSTNAME
         valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
       - name: NODE_IP
         valueFrom:
            fieldRef:
              fieldPath: status.hostIP
       - name: PODNAME
         valueFrom:
            fieldRef:
              fieldPath: metadata.name
       - name: SIDECAR_SCRAPING_ENABLED
         value: "true"
       volumeMounts:
        - mountPath: C:\ProgramData\docker\containers
          name: docker-windows-containers
          readOnly: true
        - mountPath: C:\var #Read + Write access on this for position file
          name: docker-windows-kuberenetes-container-logs
        - mountPath: C:\etc\config\settings
          name: settings-vol-config
          readOnly: true
        - mountPath: C:\etc\omsagent-secret
          name: omsagent-secret
          readOnly: true
       livenessProbe:
          exec:
            command:
              - cmd
              - /c
              - C:\opt\omsagentwindows\scripts\cmd\livenessprobe.exe
              - fluent-bit.exe
              - fluentdwinaks
              - "C:\\etc\\omsagentwindows\\filesystemwatcher.txt"
              - "C:\\etc\\omsagentwindows\\renewcertificate.txt"
          periodSeconds: 60
          initialDelaySeconds: 180
          timeoutSeconds: 15
   tolerations:
        - effect: NoSchedule
          operator: Exists
        - effect: NoExecute
          operator: Exists
        - effect: PreferNoSchedule
          operator: Exists
   volumes:
    - name: docker-windows-kuberenetes-container-logs
      hostPath:
        path: C:\var
    - name: docker-windows-containers
      hostPath:
        path: C:\ProgramData\docker\containers
        type: DirectoryOrCreate
    - name: settings-vol-config
      configMap:
        name: container-azm-ms-agentconfig
        optional: true
    - name: omsagent-secret
      secret:
       secretName: omsagent-secret
    - name: omsagent-adx-secret
      secret:
       secretName: omsagent-adx-secret
       optional: true
---
# Source: azuremonitor-containers/templates/omsagent-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
 name: omsagent
 namespace: kube-system
 labels:
   chart: azuremonitor-containers-2.9.0
   release: azmon-containers-release-1
   heritage: Helm
   component: oms-agent
   tier: node
spec:
 updateStrategy:
  type: RollingUpdate
 selector:
   matchLabels:
     dsName: "omsagent-ds"
 template:
  metadata:
   labels:
    dsName: "omsagent-ds"
   annotations:
    agentVersion: ciprod10132021
    dockerProviderVersion: 16.0.0-0
    schema-versions: "v1"
    checksum/secret: 1bbde4b81f4df456b3fa2058fc8b00e1cf6b87baf3a1871c0bfb15ac59f050db
    checksum/config: df888ab22fda4ade5df05ea26f22950bf9d469d8a10c8f35d905c3bde4fe6e3f
    checksum/logsettings: 3dc8a9960bc79b1b25954116e02249e320f0f73c13306ffcfd47171d05d0ba0a
  spec:
   priorityClassName: omsagent
   dnsConfig:
     options:
       - name: ndots
         value: "3"
   serviceAccountName: omsagent
   containers:
     - name: omsagent
       image: mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod10132021
       imagePullPolicy: IfNotPresent
       resources:
         limits:
           cpu: 150m
           memory: 750Mi
         requests:
           cpu: 75m
           memory: 325Mi
       env:
       - name: AKS_RESOURCE_ID
         value: "/subscriptions/0d2cb74f-ef20-4daf-9888-32c8a4846fcf/resourceGroups/AzSHCI-Cluster-rg/providers/Microsoft.Kubernetes/connectedClusters/demo"
       - name: AKS_REGION
         value: "eastus"
       - name: CONTROLLER_TYPE
         value: "DaemonSet"
       - name: NODE_IP
         valueFrom:
            fieldRef:
              fieldPath: status.hostIP
       - name: USER_ASSIGNED_IDENTITY_CLIENT_ID
         value: ""
       - name: FBIT_SERVICE_FLUSH_INTERVAL
         value: "15"
       - name: FBIT_TAIL_BUFFER_CHUNK_SIZE
         value: "1"
       - name: FBIT_TAIL_BUFFER_MAX_SIZE
         value: "1"
       - name: ISTEST
         value: "false"
       securityContext:
         privileged: true
       ports:
       - containerPort: 25225
         protocol: TCP
       - containerPort: 25224
         protocol: UDP
       volumeMounts:
        - mountPath: /hostfs
          name: host-root
          readOnly: true
        - mountPath: /var/run/host
          name: docker-sock
        - mountPath: /var/log
          name: host-log
        - mountPath: /var/lib/docker/containers
          name: containerlog-path
        - mountPath: /etc/kubernetes/host
          name: azure-json-path
        - mountPath: /etc/omsagent-secret
          name: omsagent-secret
          readOnly: true
        - mountPath: /etc/config/settings
          name: settings-vol-config
          readOnly: true
        - mountPath: /etc/config/settings/adx
          name: omsagent-adx-secret
          readOnly: true
       livenessProbe:
        exec:
         command:
         - /bin/bash
         - -c
         - "/opt/livenessprobe.sh"
        initialDelaySeconds: 60
        periodSeconds: 60
        timeoutSeconds: 15
     - name: omsagent-prometheus
       image: mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod10132021
       imagePullPolicy: IfNotPresent
       resources:
         limits:
           cpu: 500m
           memory: 1Gi
         requests:
           cpu: 75m
           memory: 225Mi
       env:
       - name: AKS_RESOURCE_ID
         value: "/subscriptions/0d2cb74f-ef20-4daf-9888-32c8a4846fcf/resourceGroups/AzSHCI-Cluster-rg/providers/Microsoft.Kubernetes/connectedClusters/demo"
       - name: AKS_REGION
         value: "eastus"
       - name: CONTROLLER_TYPE
         value: "DaemonSet"
       - name: CONTAINER_TYPE
         value: "PrometheusSidecar"
       - name: NODE_IP
         valueFrom:
            fieldRef:
              fieldPath: status.hostIP
       - name: ISTEST
         value: "false"
       securityContext:
         privileged: true
       volumeMounts:
         - mountPath: /etc/kubernetes/host
           name: azure-json-path
         - mountPath: /etc/omsagent-secret
           name: omsagent-secret
           readOnly: true
         - mountPath: /etc/config/settings
           name: settings-vol-config
           readOnly: true
         - mountPath: /etc/config/osm-settings
           name: osm-settings-vol-config
           readOnly: true
       livenessProbe:
         exec:
           command:
             - /bin/bash
             - -c
             - /opt/livenessprobe.sh
         initialDelaySeconds: 60
         periodSeconds: 60
         timeoutSeconds: 15
   affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - labelSelector: null
              matchExpressions:
              - key: beta.kubernetes.io/os
                operator: In
                values:
                - linux
              - key: type
                operator: NotIn
                values:
                - virtual-kubelet
              - key: beta.kubernetes.io/arch
                operator: In
                values:
                - amd64
   tolerations:
        - effect: NoSchedule
          operator: Exists
        - effect: NoExecute
          operator: Exists
        - effect: PreferNoSchedule
          operator: Exists
   volumes:
    - name: host-root
      hostPath:
       path: /
    - name: docker-sock
      hostPath:
       path: /var/run
    - name: container-hostname
      hostPath:
       path: /etc/hostname
    - name: host-log
      hostPath:
       path: /var/log
    - name: containerlog-path
      hostPath:
       path: /var/lib/docker/containers
    - name: azure-json-path
      hostPath:
       path: /etc/kubernetes
    - name: omsagent-secret
      secret:
       secretName: omsagent-secret
    - name: settings-vol-config
      configMap:
        name: container-azm-ms-agentconfig
        optional: true
    - name: omsagent-adx-secret
      secret:
       secretName: omsagent-adx-secret
       optional: true
    - name: osm-settings-vol-config
      configMap:
        name: container-azm-ms-osmconfig
        optional: true
---
# Source: azuremonitor-containers/templates/omsagent-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
 name: omsagent-rs
 namespace: kube-system
 labels:
   chart: azuremonitor-containers-2.9.0
   release: azmon-containers-release-1
   heritage: Helm
   component: oms-agent
   tier: node
spec:
 replicas: 1
 selector:
  matchLabels:
   rsName: "omsagent-rs"
 strategy:
  type: RollingUpdate
 template:
  metadata:
   labels:
    rsName: "omsagent-rs"
   annotations:
    agentVersion: ciprod10132021
    dockerProviderVersion: 16.0.0-0
    schema-versions: "v1"
    checksum/secret: 1bbde4b81f4df456b3fa2058fc8b00e1cf6b87baf3a1871c0bfb15ac59f050db
    checksum/config: df888ab22fda4ade5df05ea26f22950bf9d469d8a10c8f35d905c3bde4fe6e3f
    checksum/logsettings: 3dc8a9960bc79b1b25954116e02249e320f0f73c13306ffcfd47171d05d0ba0a
  spec:
   serviceAccountName: omsagent
   containers:
     - name: omsagent
       image: mcr.microsoft.com/azuremonitor/containerinsights/ciprod:ciprod10132021
       imagePullPolicy: IfNotPresent
       resources:
         limits:
           cpu: 1
           memory: 1Gi
         requests:
           cpu: 150m
           memory: 250Mi
       env:
       - name: AKS_RESOURCE_ID
         value: "/subscriptions/0d2cb74f-ef20-4daf-9888-32c8a4846fcf/resourceGroups/AzSHCI-Cluster-rg/providers/Microsoft.Kubernetes/connectedClusters/demo"
       - name: AKS_REGION
         value: "eastus"
       - name: CONTROLLER_TYPE
         value: "ReplicaSet"
       - name: NODE_IP
         valueFrom:
            fieldRef:
              fieldPath: status.hostIP
       - name: USER_ASSIGNED_IDENTITY_CLIENT_ID
         value: ""
       - name: SIDECAR_SCRAPING_ENABLED
         value: "true"
       - name: ISTEST
         value: "false"
       securityContext:
         privileged: true
       ports:
       - containerPort: 25225
         protocol: TCP
       - containerPort: 25224
         protocol: UDP
       - containerPort: 25227
         protocol: TCP
         name: in-rs-tcp
       volumeMounts:
        - mountPath: /var/run/host
          name: docker-sock
        - mountPath: /var/log
          name: host-log
        - mountPath: /var/lib/docker/containers
          name: containerlog-path
        - mountPath: /etc/kubernetes/host
          name: azure-json-path
        - mountPath: /etc/omsagent-secret
          name: omsagent-secret
          readOnly: true
        - mountPath : /etc/config
          name: omsagent-rs-config
        - mountPath: /etc/config/settings
          name: settings-vol-config
          readOnly: true
        - mountPath: /etc/config/settings/adx
          name: omsagent-adx-secret
          readOnly: true
        - mountPath: /etc/config/osm-settings
          name: osm-settings-vol-config
          readOnly: true
       livenessProbe:
        exec:
         command:
         - /bin/bash
         - -c
         - "/opt/livenessprobe.sh"
        initialDelaySeconds: 60
        periodSeconds: 60
        timeoutSeconds: 15
   affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: storageprofile
                operator: NotIn
                values:
                - managed
            weight: 1
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - labelSelector: null
              matchExpressions:
              - key: beta.kubernetes.io/os
                operator: In
                values:
                - linux
              - key: type
                operator: NotIn
                values:
                - virtual-kubelet
              - key: kubernetes.io/role
                operator: NotIn
                values:
                - master
              - key: beta.kubernetes.io/arch
                operator: In
                values:
                - amd64
   tolerations:
        - effect: NoSchedule
          operator: Exists
        - effect: NoExecute
          operator: Exists
        - effect: PreferNoSchedule
          operator: Exists
   volumes:
    - name: docker-sock
      hostPath:
       path: /var/run
    - name: container-hostname
      hostPath:
       path: /etc/hostname
    - name: host-log
      hostPath:
       path: /var/log
    - name: containerlog-path
      hostPath:
       path: /var/lib/docker/containers
    - name: azure-json-path
      hostPath:
       path: /etc/kubernetes
    - name: omsagent-secret
      secret:
       secretName: omsagent-secret
    - name: omsagent-rs-config
      configMap:
        name: omsagent-rs-config
    - name: settings-vol-config
      configMap:
        name: container-azm-ms-agentconfig
        optional: true
    - name: omsagent-adx-secret
      secret:
       secretName: omsagent-adx-secret
       optional: true
    - name: osm-settings-vol-config
      configMap:
        name: container-azm-ms-osmconfig
        optional: true
---
# Source: azuremonitor-containers/templates/omsagent-arc-k8s-crd.yaml
#extension model
apiVersion:  clusterconfig.azure.com/v1beta1
kind: AzureClusterIdentityRequest
metadata:
  name: container-insights-clusteridentityrequest
  namespace: azure-arc
spec:
  audience: https://monitoring.azure.com/
---
# Source: azuremonitor-containers/templates/omsagent-priorityclass.yaml
# This pod priority class is used for daemonsets to allow them to have priority
# over pods that can be scheduled elsewhere.  Without a priority class, it is
# possible for a node to fill up with pods before the daemonset pods get to be
# created for the node or get scheduled.  Note that pods are not "daemonset"
# pods - they are just pods created by the daemonset controller but they have
# a specific affinity set during creation to the specific node each pod was
# created to run on (daemonset controller takes care of that)
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: omsagent
  # Priority classes don't have labels :-)
  annotations:
    chart: azuremonitor-containers-2.9.0
    release: azmon-containers-release-1
    heritage: Helm
    component: oms-agent
value: 10
globalDefault: false
description: "This is the daemonset priority class for omsagent"

NOTES:
azmon-containers-release-1 deployment is complete.
opinsights.azure.com is configured Azure Log Analytics Workspace Domain.
Data should start flowing to the Log Analytics workspace shortly.
Proceed to below link to view health and monitoring data of your clusters
- Azure Public Cloud Portal URL : https://aka.ms/azmon-containers
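
The `data` fields of the `omsagent-secret` manifest above are base64-encoded, so one quick sanity check is to decode them and compare the result with the user-supplied values. A minimal sketch, assuming a `base64` that accepts `-d` (GNU coreutils); the helper name `decode_secret_value` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: decode one base64-encoded Secret data value.
decode_secret_value() {
  printf '%s' "$1" | base64 -d
}

# Values copied from the WSID and DOMAIN fields of the manifest above.
decode_secret_value 'ZWYxMGNkNmItODUxMi00NWVmLTg0MjItOWUzY2M4MjkzNDhl'; echo
decode_secret_value 'b3BpbnNpZ2h0cy5henVyZS5jb20='; echo
```

The two decoded strings match `wsid` and `domain` under USER-SUPPLIED VALUES, so the Secret contents themselves look consistent with what was passed to the chart.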

jaromirk commented 2 years ago

It looks like the OMS agent is installed and configured. Log from the omsagent-rs pod:

kubectl logs omsagent-rs-95d89dc8f-9tnm2 -n kube-system
customResourceId:/subscriptions/0d2cb74f-ef20-4daf-9888-32c8a4846fcf/resourceGroups/AzSHCI-Cluster-rg/providers/Microsoft.Kubernetes/connectedClusters/demo
customRegion:eastus
Making curl request to oms endpint with domain: opinsights.azure.com
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl request to oms endpoint succeeded.
****************Start Config Processing********************
Both stdout & stderr log collection are turned off for namespaces: '*_kube-system_*.log'
****************End Config Processing********************
****************Start Config Processing********************
****************Start NPM Config Processing********************
config::npm::Successfully substituted the NPM placeholders into /etc/opt/microsoft/docker-cimprov/telegraf-rs.conf file for ReplicaSet
****************Start Prometheus Config Processing********************
config::No configmap mounted for prometheus custom config, using defaults
****************End Prometheus Config Processing********************
****************Start MDM Metrics Config Processing********************
****************End MDM Metrics Config Processing********************
****************Start Metric Collection Settings Processing********************
****************End Metric Collection Settings Processing********************
Making wget request to cadvisor endpoint with port 10250
Wget request using port 10250 succeeded. Using 10250
Making curl request to cadvisor endpoint /pods with port 10250 to get the configured container runtime on kubelet
configured container runtime on kubelet is : containerd
set caps for ruby process to read container env from proc
moc-l3z5a8d9wdy
 * Starting periodic command scheduler cron
   ...done.
docker-cimprov 16.0.0.0
DOCKER_CIMPROV_VERSION=16.0.0.0
*** activating oneagent in legacy auth mode ***
setting mdsd workspaceid & key for workspace:ef10cd6b-8512-45ef-8422-9e3cc829348e
azure-mdsd 1.14.2-build.master.284
starting mdsd mode in main container...
setting up cronjob for ci agent log rotation
*** starting fluentd v1 in replicaset
starting fluent-bit and setting telegraf conf file for replicaset
nodename: moc-l3z5a8d9wdy
replacing nodename in telegraf config
checking for listener on tcp #25226 and waiting for 30 secs if not..
File Doesnt Exist. Creating file...
Fluent Bit v1.6.8
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

Routing container logs thru v2 route...
waitforlisteneronTCPport found listener on port:25226 in 1 secs
2022-01-06T08:47:50Z I! Starting Telegraf 1.18.0
Telegraf 1.18.0 (git: HEAD ac5c7f6a)
td-agent-bit 1.6.8
stopping rsyslog...
 * Stopping enhanced syslogd rsyslogd
   ...done.
getting rsyslog status...
 * rsyslogd is not running
2022-01-06T08:47:53.3453110Z: Onboarding success. Sucessfully registered certificate with OMS Homing service.
Onboarding success
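
To repeat this check without reading the whole log, the output can be filtered for the onboarding marker. A rough sketch; `check_onboarding` is a hypothetical helper, and a match only shows that the agent registered with OMS, not that data is reaching the workspace:

```shell
#!/bin/sh
# Hypothetical helper: reads agent log text on stdin and
# succeeds if the onboarding marker appears anywhere in it.
check_onboarding() {
  grep -q 'Onboarding success'
}

# Example run against a line taken from the log above.
if printf '%s\n' 'Onboarding success' | check_onboarding; then
  echo 'agent onboarded'
else
  echo 'no onboarding marker found'
fi
```

Against a live cluster the input would come from the same command used above, for example `kubectl logs omsagent-rs-95d89dc8f-9tnm2 -n kube-system | check_onboarding`.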

jaromirk commented 2 years ago

Fixed in the latest commit. I guess the onboarding process changed; Helm is no longer needed.