Open reschenburgIDBS opened 4 years ago
I'm not sure if it helps, but here are some logs from around the time of the second spike in the picture:
```
[service:policy-engine] 2020-06-03 10:31:30+0000 [-] [Thread-44] [anchore_engine.services.policy_engine.engine.logs/info()] [INFO] Db merge took 265.92521929740906 sec
[service:policy-engine] 2020-06-03 10:31:34+0000 [-] "10.0.3.109" - - [03/Jun/2020:10:31:33 +0000] "GET /health HTTP/1.1" 200 - "-" "kube-probe/1.15+"
[service:policy-engine] 2020-06-03 10:31:35+0000 [-] "10.0.3.109" - - [03/Jun/2020:10:31:34 +0000] "GET /health HTTP/1.1" 200 - "-" "kube-probe/1.15+"
[service:policy-engine] 2020-06-03 10:31:46+0000 [-] "10.0.3.109" - - [03/Jun/2020:10:31:46 +0000] "GET /health HTTP/1.1" 200 - "-" "kube-probe/1.15+"
[service:policy-engine] 2020-06-03 10:31:46+0000 [-] "10.0.3.109" - - [03/Jun/2020:10:31:46 +0000] "GET /health HTTP/1.1" 200 - "-" "kube-probe/1.15+"
[service:policy-engine] 2020-06-03 10:31:46+0000 [-] [Thread-8] [anchore_engine.services.policy_engine/handle_feed_sync_trigger()] [INFO] Feed Sync task creator activated
[service:policy-engine] 2020-06-03 10:31:46+0000 [-] [Thread-8] [anchore_engine.services.policy_engine/handle_feed_sync_trigger()] [INFO] Feed Sync Trigger done, waiting for next cycle.
[service:policy-engine] 2020-06-03 10:31:47+0000 [-] [Thread-8] [anchore_engine.services.policy_engine/handle_feed_sync_trigger()] [INFO] Feed Sync task creator complete
```
The actual alert that memory had gone above 7GB fired at 10:34:20, after the Feed Sync task creator completed.
There was a significant improvement in memory usage in the 0.6.1 release of Engine. This was a known issue with 0.6.0 that is resolved via an upgrade. Upgrading to 0.6.1 has no db upgrade requirements, so is relatively fast and safe. Or you can upgrade all the way to 0.7.1, which does have a db upgrade step but has other benefits as well.
The issue you're seeing is the policy engine working through large chunks of data during the feed sync process. The fix in 0.6.1 is to spool that data to disk, avoiding as much in-memory processing as possible to keep the memory footprint small. After the upgrade you should see usage in the 800MB range for the feed sync process, instead of 6GB+.
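This isn't Anchore's actual code, but a minimal Python sketch of the spool-to-disk pattern described above: records are written to a temporary file as they arrive, then read back and merged in small batches, so memory use is bounded by the batch size rather than the full feed. The `sync_feed` function and its batch size are made up for illustration.

```python
import json
import tempfile

def sync_feed(records, batch_size=100):
    """Spool feed records to a temp file on disk, then merge them
    in fixed-size batches instead of holding everything in memory."""
    processed = 0
    with tempfile.TemporaryFile(mode="w+") as spool:
        # Phase 1: spill each incoming record to disk as one JSON line.
        for rec in records:
            spool.write(json.dumps(rec) + "\n")
        spool.seek(0)

        # Phase 2: read back and merge in batches; only `batch_size`
        # records are ever resident in memory at once.
        batch = []
        for line in spool:
            batch.append(json.loads(line))
            if len(batch) >= batch_size:
                processed += len(batch)  # a real implementation would db-merge here
                batch.clear()
        if batch:
            processed += len(batch)
    return processed
```

The trade-off is extra disk I/O per sync in exchange for a flat memory profile, which is exactly the behavior change reported between 0.6.0 and 0.6.1.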
Any updates on this if the upgrade fixed it @reschenburgIDBS ?
Hi @zhill ! I have the same problem with version 1.1.0 deployed with https://github.com/anchore/anchore-charts
When I launch a global scan from Harbor (over a thousand images), this issue is more visible and happens faster: Harbor launches 10 simultaneous scans.
These pods are sized with a large amount of memory:
Maybe there is a memory leak?
Request for help!
I think it may well be a bug in the Helm chart.
In short, my policy engine keeps restarting with an OOMKilled error. I don't think it's causing any actual issues, at least not that I've noticed, but it's still annoying.
I've set the resource limits and requests for the policy engine container as follows:
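(The actual values didn't survive in this thread; purely for illustration, a Kubernetes resources stanza for a container looks like the following — the numbers here are hypothetical, not the reporter's settings.)

```yaml
resources:
  requests:
    memory: "4Gi"   # illustrative value only
    cpu: "1"
  limits:
    memory: "8Gi"   # the limit the OOM killer enforces; illustrative only
    cpu: "2"
```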
deployed versions:
everything else works fine:
cli version
anchore-cli, version 0.7.1
What docker images are you using:
docker.io/anchore/anchore-engine:v0.6.0
Here's a fun picture of what that looks like on a graph: ![image](https://user-images.githubusercontent.com/53173152/83628590-a5937700-a590-11ea-966e-5dd26aa5e0da.png)
I suspect I'm missing an Xmx-equivalent setting somewhere - any help would be much appreciated!