I had to increase the memory limit for the Scan API to 1GB to get it working. A limit of 700MB wasn't enough.
The container scan had a memory limit of 300M when it got OOMKilled.
The issue started with this error in the logs of the Scan API:
2023-08-30T05:01:59Z FTL could not enable scan queue error="unable to create queue segment in /tmp/cnspec-queue/disk-queue: unable to load queue segment in /tmp/cnspec-queue/disk-queue: segment file /tmp/cnspec-queue/disk-queue/0000000000046.dque is corrupted: error reading gob data from file: EOF"
Perhaps the queue file grew too big?
To Reproduce
Steps to reproduce the behavior:
1. Deploy the operator on a GKE cluster
2. Start scanning
3. Note the error
Expected behavior
The scans should run without being killed, even with a reduced memory limit.
Screenshots or CLI Output
The GCP metrics aren't that helpful; perhaps the 60s sampling interval is too coarse to catch the spike:
container/memory/used_bytes
... Sampled every 60 seconds.
Describe the bug
When running the container scan and the k8s resource scan inside a k8s cluster, both scans get OOMKilled. mondoo-client-k8s-scan-now-mcvfw is a manually triggered k8s resource scan.

GCP metric documentation: https://cloud.google.com/monitoring/api/metrics_kubernetes
Desktop (please complete the following information):
latest operator v1.15.2
latest cnspec v8 image