splunk / splunk-connect-for-snmp

Splunk Connect for SNMP
https://splunk.github.io/splunk-connect-for-snmp/
Apache License 2.0

ErrImageNeverPull errors #711

Closed: sirdroodle closed this issue 4 months ago

sirdroodle commented 1 year ago

My connectivity is OK, running on: Linux gns3 5.4.0-1101-gcp #110~18.04.1-Ubuntu SMP Wed Feb 22 08:14:46

I see the following: > microk8s helm3 uninstall snmp -n sc4snmp

Any ideas why? - thanks.

NAME                                                          READY   STATUS              RESTARTS   AGE
snmp-splunk-connect-for-snmp-worker-poller-7ff7c755cb-t5r7r   0/1     Pending             0          32s
snmp-splunk-connect-for-snmp-worker-poller-7ff7c755cb-xkxqq   0/1     ContainerCreating   0          32s
snmp-splunk-connect-for-snmp-worker-poller-7ff7c755cb-zk25q   0/1     Pending             0          32s
snmp-splunk-connect-for-snmp-worker-poller-7ff7c755cb-7kwhq   0/1     Pending             0          32s
snmp-splunk-connect-for-snmp-worker-poller-7ff7c755cb-2zv9j   0/1     Pending             0          31s
snmp-splunk-connect-for-snmp-worker-sender-7d575cbdfb-gm9p2   0/1     ErrImageNeverPull   0          32s
snmp-splunk-connect-for-snmp-worker-trap-77ff968c5b-xv7bl     0/1     ErrImageNeverPull   0          32s
snmp-splunk-connect-for-snmp-inventory-wlhqb                  0/1     ErrImageNeverPull   0          32s
snmp-splunk-connect-for-snmp-scheduler-8696c4b9cf-fwps2       0/1     ErrImageNeverPull   0          32s
snmp-mibserver-847c5bf574-7rhsv                               1/1     Running             0          32s
snmp-splunk-connect-for-snmp-trap-7467949778-ntblm            0/1     ErrImageNeverPull   0          31s
snmp-splunk-connect-for-snmp-trap-7467949778-drrhv            0/1     ErrImageNeverPull   0          31s
snmp-redis-master-0                                           0/1     Running             0          31s
snmp-splunk-connect-for-snmp-worker-poller-7ff7c755cb-spln7   0/1     ContainerCreating   0          32s
snmp-mongodb-6785df979d-6d7d4                                 2/2     Running             0          32
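For anyone hitting the same thing: ErrImageNeverPull normally means the pod spec has imagePullPolicy: Never while the image is not present in the node's local containerd store. A quick way to confirm that (pod name taken from the listing above; a sketch, not the exact diagnosis from this thread):

# Which image and pull policy does one of the failing pods use?
microk8s kubectl get pod snmp-splunk-connect-for-snmp-inventory-wlhqb -n sc4snmp \
  -o jsonpath='{.spec.containers[*].image}{"\n"}{.spec.containers[*].imagePullPolicy}{"\n"}'

# Is that image actually present in microk8s' containerd image store?
microk8s ctr images ls | grep splunk-connect-for-snmp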

sirdroodle commented 1 year ago

OK, after having a good read of the following, I now see some improvement: https://splunk-usergroups.slack.com/archives/C01K4V86WV7/p1664781893672949
Current status:

NAME                                                          READY   STATUS              RESTARTS        AGE
snmp-splunk-connect-for-snmp-worker-poller-54c84f9f78-jc7wf   0/1     Pending             0               18m
snmp-splunk-connect-for-snmp-worker-poller-54c84f9f78-c7d8t   0/1     Pending             0               18m
snmp-splunk-connect-for-snmp-worker-poller-54c84f9f78-ns4nt   0/1     Pending             0               18m
snmp-splunk-connect-for-snmp-worker-poller-54c84f9f78-p2zdg   0/1     Pending             0               18m
snmp-splunk-connect-for-snmp-worker-poller-54c84f9f78-tvqp5   0/1     ContainerCreating   0               18m
snmp-splunk-connect-for-snmp-worker-trap-79d9bb5dd6-8h7sb     1/1     Running             0               18m
snmp-mibserver-847c5bf574-qdpc2                               1/1     Running             0               18m
snmp-splunk-connect-for-snmp-worker-sender-7b7d9bc858-q7862   1/1     Running             0               18m
snmp-splunk-connect-for-snmp-scheduler-5c6567589-fj5mg        1/1     Running             0               18m
snmp-redis-master-0                                           1/1     Running             0               18m
snmp-splunk-connect-for-snmp-worker-poller-54c84f9f78-59t9v   0/1     ContainerCreating   0               18m
snmp-mongodb-6785df979d-vbqjl                                 2/2     Running             0               18m
snmp-splunk-connect-for-snmp-trap-8568c79546-9vtcn            0/1     CrashLoopBackOff    7 (5m6s ago)    18m
snmp-splunk-connect-for-snmp-trap-8568c79546-b8jfh            0/1     CrashLoopBackOff    7 (4m44s ago)   18m
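The Slack thread itself isn't reproduced here, but the two usual remedies for ErrImageNeverPull on microk8s are sketched below; both are assumptions about what might have been changed, not a record of the exact fix, and the image.pullPolicy key and chart name follow the SC4SNMP install docs (adjust to your setup):

# Option 1: let kubelet pull the image from ghcr.io instead of expecting a local copy
microk8s helm3 upgrade --install snmp splunk-connect-for-snmp/splunk-connect-for-snmp \
  --namespace sc4snmp -f values.yaml --set image.pullPolicy=Always

# Option 2: side-load the image into the local containerd store so pullPolicy Never can find it
microk8s ctr images pull ghcr.io/splunk/splunk-connect-for-snmp/container:1.8.6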

sirdroodle commented 1 year ago

microk8s kubectl get pods -A

NAMESPACE        NAME                                                              READY   STATUS              RESTARTS        AGE
kube-system      hostpath-provisioner-766849dd9d-6cxk8                             1/1     Running             1 (3h24m ago)   26h
kube-system      calico-kube-controllers-5bf67f476c-9xjfg                          1/1     Running             1 (3h24m ago)   26h
kube-system      coredns-d489fb88-phdkb                                            1/1     Running             1 (3h24m ago)   26h
default          sck-splunk-otel-collector-k8s-cluster-receiver-bbb584bd7-b8n5t    1/1     Running             1 (3h24m ago)   25h
metallb-system   controller-56c4696b5-fg94x                                        1/1     Running             1 (3h24m ago)   26h
metallb-system   speaker-8xhqs                                                     1/1     Running             1 (3h24m ago)   26h
kube-system      calico-node-ml96q                                                 1/1     Running             1 (3h24m ago)   26h
kube-system      metrics-server-6b6844c455-xpvmd                                   1/1     Running             1 (3h24m ago)   26h
default          sck-splunk-otel-collector-agent-b8nh6                             1/1     Running             2 (3h22m ago)   25h
sc4snmp          snmp-splunk-connect-for-snmp-worker-poller-54c84f9f78-jc7wf       0/1     Pending             0               19m
sc4snmp          snmp-splunk-connect-for-snmp-worker-poller-54c84f9f78-c7d8t       0/1     Pending             0               19m
sc4snmp          snmp-splunk-connect-for-snmp-worker-poller-54c84f9f78-ns4nt       0/1     Pending             0               19m
sc4snmp          snmp-splunk-connect-for-snmp-worker-poller-54c84f9f78-p2zdg       0/1     Pending             0               19m
sc4snmp          snmp-splunk-connect-for-snmp-worker-poller-54c84f9f78-tvqp5       0/1     ContainerCreating   0               19m
sc4snmp          snmp-splunk-connect-for-snmp-worker-trap-79d9bb5dd6-8h7sb         1/1     Running             0               19m
sc4snmp          snmp-mibserver-847c5bf574-qdpc2                                   1/1     Running             0               19m
sc4snmp          snmp-splunk-connect-for-snmp-worker-sender-7b7d9bc858-q7862       1/1     Running             0               19m
sc4snmp          snmp-splunk-connect-for-snmp-scheduler-5c6567589-fj5mg            1/1     Running             0               19m
sc4snmp          snmp-redis-master-0                                               1/1     Running             0               19m
sc4snmp          snmp-splunk-connect-for-snmp-worker-poller-54c84f9f78-59t9v       0/1     ContainerCreating   0               19m
sc4snmp          snmp-mongodb-6785df979d-vbqjl                                     2/2     Running             0               19m
sc4snmp          snmp-splunk-connect-for-snmp-trap-8568c79546-9vtcn                0/1     CrashLoopBackOff    8 (47s ago)     19m
sc4snmp          snmp-splunk-connect-for-snmp-trap-8568c79546-b8jfh                0/1     CrashLoopBackOff    8 (37s ago)     19m

omrozowicz-splunk commented 1 year ago

Hey, it looks like there is a problem with the trap configuration. Can you run microk8s kubectl logs -f pod/snmp-splunk-connect-for-snmp-trap-8568c79546-9vtcn -n sc4snmp? It should give us a hint. 0/1 Pending may be a sign of insufficient resources. Verify it with microk8s kubectl describe pod/snmp-splunk-connect-for-snmp-worker-poller-54c84f9f78-jc7wf -n sc4snmp and microk8s kubectl get events -n sc4snmp. Also, microk8s kubectl describe nodes can give you an idea of how many resources you have and how many you need.
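For convenience, those checks collected into one runnable block (pod names taken from the listing above; substitute whatever microk8s kubectl get pods -n sc4snmp currently shows):

# Why is the trap pod crash-looping?
microk8s kubectl logs -f pod/snmp-splunk-connect-for-snmp-trap-8568c79546-9vtcn -n sc4snmp

# Why is a worker-poller pod still Pending?
microk8s kubectl describe pod/snmp-splunk-connect-for-snmp-worker-poller-54c84f9f78-jc7wf -n sc4snmp
microk8s kubectl get events -n sc4snmp

# How much CPU/memory does the node offer versus what is already requested?
microk8s kubectl describe nodes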

sirdroodle commented 1 year ago

OK, it looks better. Traps are now working, but there is still nothing for the SNMP poller:

NAME                                                          READY   STATUS              RESTARTS         AGE
snmp-splunk-connect-for-snmp-worker-poller-76c9f67747-rhb4d   0/1     Pending             0                3d19h
snmp-mibserver-847c5bf574-cs48f                               1/1     Running             0                3d19h
snmp-redis-master-0                                           1/1     Running             0                3d19h
snmp-splunk-connect-for-snmp-worker-poller-76c9f67747-dt65p   0/1     ContainerCreating   0                3d19h
snmp-splunk-connect-for-snmp-worker-sender-5f4dc8477c-b4gwb   1/1     Running             9 (3d19h ago)    3d19h
snmp-splunk-connect-for-snmp-worker-trap-6c9ff9cbdf-v6htf     1/1     Running             9 (3d19h ago)    3d19h
snmp-splunk-connect-for-snmp-worker-trap-6c9ff9cbdf-6d9jd     1/1     Running             11 (3d18h ago)   3d19h
snmp-splunk-connect-for-snmp-scheduler-ff5668fd-xqhgx         1/1     Running             13 (3d18h ago)   3d19h
snmp-splunk-connect-for-snmp-trap-674bbf977d-cbq65            1/1     Running             15 (3d18h ago)   3d19h
snmp-splunk-connect-for-snmp-trap-674bbf977d-7rznh            1/1     Running             15 (3d18h ago)   3d19h
snmp-mongodb-75b89b595f-6c6d8                                 2/2     Running             4 (3d17h ago)    3d19h

sirdroodle commented 1 year ago

microk8s kubectl describe pod/snmp-splunk-connect-for-snmp-worker-poller-76c9f67747-rhb4d -n sc4snmp

Name:             snmp-splunk-connect-for-snmp-worker-poller-76c9f67747-rhb4d
Namespace:        sc4snmp
Priority:         0
Service Account:  snmp-splunk-connect-for-snmp-worker
Node:
Labels:           app.kubernetes.io/instance=snmp
                  app.kubernetes.io/name=splunk-connect-for-snmp-worker-poller
                  pod-template-hash=76c9f67747
Annotations:
Status:           Pending
IP:
IPs:
Controlled By:    ReplicaSet/snmp-splunk-connect-for-snmp-worker-poller-76c9f67747
Containers:
  splunk-connect-for-snmp-worker-poller:
    Image:      ghcr.io/splunk/splunk-connect-for-snmp/container:1.8.6
    Port:
    Host Port:
    Args:
      celery
      worker-poller
    Limits:
      cpu:  500m
    Requests:
      cpu:  250m
    Environment:
      CONFIG_PATH:               /app/config/config.yaml
      REDIS_URL:                 redis://snmp-redis-headless:6379/1
      SC4SNMP_VERSION:           1.8.6
      CELERY_BROKER_URL:         redis://snmp-redis-headless:6379/0
      MONGO_URI:                 mongodb://snmp-mongodb:27017
      WALK_RETRY_MAX_INTERVAL:   600
      METRICS_INDEXING_ENABLED:  false
      LOG_LEVEL:                 INFO
      UDP_CONNECTION_TIMEOUT:    3
      PROFILES_RELOAD_DELAY:     60
      MIB_SOURCES:               http://snmp-mibserver/asn1/@mib@
      MIB_INDEX:                 http://snmp-mibserver/index.csv
      MIB_STANDARD:              http://snmp-mibserver/standard.txt
      SPLUNK_HEC_SCHEME:         https
      SPLUNK_HEC_HOST:           34.129.187.83
      IGNORE_EMPTY_VARBINDS:     false
      SPLUNK_HEC_PORT:           8088
      SPLUNK_HEC_INSECURESSL:    true
      SPLUNK_HEC_TOKEN:          <set to the key 'hec_token' in secret 'splunk-connect-for-snmp-splunk'>  Optional: false
      WORKER_CONCURRENCY:        4
      PREFETCH_COUNT:            1
    Mounts:
      /.pysnmp/ from pysnmp-cache-volume (rw)
      /app/config from config (ro)
      /app/secrets/snmpv3/sc4snmp-hlab-sha-aes from sc4snmp-hlab-sha-aes-snmpv3-secrets (ro)
      /app/secrets/snmpv3/sc4snmp-hlab-sha-des from sc4snmp-hlab-sha-des-snmpv3-secrets (ro)
      /tmp/ from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-8z7vd (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      splunk-connect-for-snmp-config
    Optional:  false
  sc4snmp-hlab-sha-aes-snmpv3-secrets:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  sc4snmp-hlab-sha-aes
    Optional:    false
  sc4snmp-hlab-sha-des-snmpv3-secrets:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  sc4snmp-hlab-sha-des
    Optional:    false
  pysnmp-cache-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:
  kube-api-access-8z7vd:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:        Burstable
Node-Selectors:
Tolerations:      node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                  node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:

sirdroodle commented 1 year ago

microk8s kubectl describe nodes

Name:               gns3
Roles:
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=gns3
                    kubernetes.io/os=linux
                    microk8s.io/cluster=true
                    node.kubernetes.io/microk8s-controlplane=microk8s-controlplane
Annotations:        node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 10.192.0.3/32
                    projectcalico.org/IPv4VXLANTunnelAddr: 10.1.254.128
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 08 Mar 2023 15:59:11 +1100
Taints:
Unschedulable:      false
Lease:
  HolderIdentity:  gns3
  AcquireTime:
  RenewTime:       Tue, 14 Mar 2023 09:30:43 +1100
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  NetworkUnavailable   False   Fri, 10 Mar 2023 13:09:30 +1100   Fri, 10 Mar 2023 13:09:30 +1100   CalicoIsUp                   Calico is running on this node
  MemoryPressure       False   Tue, 14 Mar 2023 09:28:27 +1100   Wed, 08 Mar 2023 15:59:11 +1100   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Tue, 14 Mar 2023 09:28:27 +1100   Wed, 08 Mar 2023 15:59:11 +1100   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Tue, 14 Mar 2023 09:28:27 +1100   Wed, 08 Mar 2023 15:59:11 +1100   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Tue, 14 Mar 2023 09:28:27 +1100   Fri, 10 Mar 2023 13:09:25 +1100   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  10.192.0.3
  Hostname:    gns3
Capacity:
  cpu:                2
  ephemeral-storage:  20134592Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             8145336Ki
  pods:               110
Allocatable:
  cpu:                2
  ephemeral-storage:  19086016Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             8042936Ki
  pods:               110
System Info:
  Machine ID:                 bd666fb222cb3e2f819041b5680fbcfd
  System UUID:                326b8af9-a56b-9fb0-bb7d-8209e8a11414
  Boot ID:                    e4398d4f-5cd7-4a05-b724-7b404ddaae21
  Kernel Version:             5.4.0-1101-gcp
  OS Image:                   Ubuntu 18.04.6 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.6.8
  Kubelet Version:            v1.25.6
  Kube-Proxy Version:         v1.25.6
Non-terminated Pods:          (19 in total)
  Namespace       Name                                                              CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  default         sck-splunk-otel-collector-k8s-cluster-receiver-bbb584bd7-b8n5t    200m (10%)    200m (10%)  500Mi (6%)       500Mi (6%)     5d17h
  kube-system     hostpath-provisioner-766849dd9d-6cxk8                             0 (0%)        0 (0%)      0 (0%)           0 (0%)         5d17h
  kube-system     calico-kube-controllers-5bf67f476c-9xjfg                          0 (0%)        0 (0%)      0 (0%)           0 (0%)         5d17h
  kube-system     coredns-d489fb88-phdkb                                            100m (5%)     0 (0%)      70Mi (0%)        170Mi (2%)     5d17h
  kube-system     calico-node-ml96q                                                 250m (12%)    0 (0%)      0 (0%)           0 (0%)         5d17h
  metallb-system  speaker-8xhqs                                                     0 (0%)        0 (0%)      0 (0%)           0 (0%)         5d17h
  kube-system     metrics-server-6b6844c455-xpvmd                                   100m (5%)     0 (0%)      200Mi (2%)       0 (0%)         5d17h
  metallb-system  controller-56c4696b5-fg94x                                        0 (0%)        0 (0%)      0 (0%)           0 (0%)         5d17h
  sc4snmp         snmp-mibserver-847c5bf574-cs48f                                   100m (5%)     100m (5%)   128Mi (1%)       128Mi (1%)     3d19h
  sc4snmp         snmp-redis-master-0                                               0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d19h
  sc4snmp         snmp-splunk-connect-for-snmp-worker-poller-76c9f67747-dt65p       250m (12%)    500m (25%)  0 (0%)           0 (0%)         3d19h
  sc4snmp         snmp-splunk-connect-for-snmp-worker-sender-5f4dc8477c-b4gwb       250m (12%)    500m (25%)  0 (0%)           0 (0%)         3d19h
  sc4snmp         snmp-splunk-connect-for-snmp-worker-trap-6c9ff9cbdf-v6htf         250m (12%)    500m (25%)  0 (0%)           0 (0%)         3d19h
  sc4snmp         snmp-splunk-connect-for-snmp-worker-trap-6c9ff9cbdf-6d9jd         250m (12%)    500m (25%)  0 (0%)           0 (0%)         3d19h
  sc4snmp         snmp-splunk-connect-for-snmp-scheduler-ff5668fd-xqhgx             0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d19h
  sc4snmp         snmp-splunk-connect-for-snmp-trap-674bbf977d-cbq65                0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d19h
  sc4snmp         snmp-splunk-connect-for-snmp-trap-674bbf977d-7rznh                0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d19h
  default         sck-splunk-otel-collector-agent-b8nh6                             200m (10%)    200m (10%)  500Mi (6%)       500Mi (6%)     5d17h
  sc4snmp         snmp-mongodb-75b89b595f-6c6d8                                     0 (0%)        0 (0%)      0 (0%)           0 (0%)         3d19h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  cpu                1950m (97%)   2500m (125%)
  memory             1398Mi (17%)  1298Mi (16%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)
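Reading the allocation summary above, the Pending pollers are explained by CPU requests alone: the node exposes 2 allocatable CPUs (2000m), 1950m (97%) is already requested, and each additional worker-poller asks for another 250m. A back-of-the-envelope check (numbers copied from the describe output, not measured separately):

# allocatable CPU on the node:            2 cores = 2000m
# already requested by scheduled pods:              1950m (97%)
# request of one more worker-poller:                 250m
echo $(( 1950 + 250 ))   # 2200m needed > 2000m allocatable -> the remaining pollers stay Pending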

sirdroodle commented 1 year ago

OK, looks like it was a resource issue. I didn't think it would matter so much in a test environment just polling one node!

microk8s kubectl get pods -n sc4snmp

NAME                                                          READY   STATUS    RESTARTS   AGE
snmp-splunk-connect-for-snmp-trap-674bbf977d-9xb65            1/1     Running   0          31m
snmp-splunk-connect-for-snmp-scheduler-ff5668fd-mv4xh         1/1     Running   0          31m
snmp-splunk-connect-for-snmp-worker-poller-5b48f9f54f-j665q   1/1     Running   0          31m
snmp-splunk-connect-for-snmp-worker-trap-6c9ff9cbdf-rtx2p     1/1     Running   0          31m
snmp-splunk-connect-for-snmp-worker-poller-5b48f9f54f-2mprz   1/1     Running   0          31m
snmp-splunk-connect-for-snmp-trap-674bbf977d-jd52k            1/1     Running   0          31m
snmp-splunk-connect-for-snmp-worker-trap-6c9ff9cbdf-vqf24     1/1     Running   0          31m
snmp-mibserver-847c5bf574-549v7                               1/1     Running   0          31m
snmp-splunk-connect-for-snmp-worker-sender-5f4dc8477c-b2dvj   1/1     Running   0          31m
snmp-redis-master-0                                           1/1     Running   0          31m
snmp-mongodb-75b89b595f-cnxns                                 2/2     Running   0          31m

sirdroodle commented 1 year ago

Still having issues polling the device, but I think that's a separate issue!

Retry in 168s: SnmpActionError('An error of SNMP isWalk=True for a host 192.168.122.139 occurred: No SNMP response received before timeout'

omrozowicz-splunk commented 1 year ago

> OK, looks like it was a resource issue. I didn't think it would matter so much in a test environment just polling one node!

By default it's sized for polling multiple devices, so it's true that it consumes a fair amount of CPU and RAM. There is an option to tweak the resource limits so it can be installed in smaller environments (https://splunk.github.io/splunk-connect-for-snmp/main/small-environment/).
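For a single-node lab like this one, the small-environment page linked above essentially amounts to lowering replica counts and CPU requests in a values override and re-running the Helm upgrade. A rough sketch only; the keys and numbers below are illustrative, so take the exact ones from the linked documentation for your chart version:

# write a small override file (illustrative values)
cat > sc4snmp-small.yaml << 'EOF'
worker:
  poller:
    replicaCount: 2
    resources:
      requests:
        cpu: 200m
      limits:
        cpu: 400m
EOF

# apply it on top of the existing release
microk8s helm3 upgrade --reuse-values snmp splunk-connect-for-snmp/splunk-connect-for-snmp \
  --namespace sc4snmp -f sc4snmp-small.yaml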

> Retry in 168s: SnmpActionError('An error of SNMP isWalk=True for a host 192.168.122.139 occurred: No SNMP response received before timeout'

This message means it couldn't poll the 192.168.122.139 host. Is it reachable from the machine SC4SNMP is installed on? Can you do a manual snmpwalk from the command line? Is it straightforward polling, or is the device somewhere behind a VIP?
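A manual walk from the host where SC4SNMP runs is the quickest sanity check here; a minimal example, assuming SNMPv2c with a public community string (substitute whatever the target device is actually configured with):

# walk just the system subtree; it should answer within a few seconds if SNMP is reachable
snmpwalk -v2c -c public -t 3 192.168.122.139 1.3.6.1.2.1.1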