Open eroznik opened 3 years ago
Hey, I have encountered the same jump in memory. It used to be 250 MB -> 400 MB max with roughly the same number of mappings; ever since 1.11.0 (which raised CPU usage) and then 1.12.2, memory is out of whack.
I checked into the community and it was suggested to do the following:
Ambassador since 1.10/1.11 requires considerably more memory than it did before, having to do with more safety checks, validating the config, and drastically speeding up validation time to push the config to envoy. Could you try setting prune_unreachable_routes to true in the Module and see if it helps? This shrinks the size of the envoy config with the caveat that unfortunately you won’t gain any benefit for regex hosts.
```yaml
spec:
  config:
    prune_unreachable_routes: true
```
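For context, a minimal sketch of where that setting lives in a complete Module resource; the metadata values here are assumptions, not from this thread:

```yaml
# Sketch: Module carrying prune_unreachable_routes.
# apiVersion/kind follow Ambassador 1.x conventions; name and
# namespace are placeholder assumptions.
apiVersion: getambassador.io/v2
kind: Module
metadata:
  name: ambassador
  namespace: ambassador
spec:
  config:
    prune_unreachable_routes: true
```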
I am currently testing this solution out.
I had some members of my team check this out as well, and they noticed that every time there is a version bump, the Helm configuration files change (additions etc.), which might have caused this.
I would like to know eventually if this is going to be normal (big pods) or will they acknowledge this as a bug.
To continue on the same issue: after testing prune_unreachable_routes: true, the problem persists. Would you suggest any other solution?
Note: closed https://github.com/datawire/ambassador/issues/3414 for the time being as the issue is similar.
If you could provide a snapshot of the processes running inside an Ambassador container after the upgrade, it will help us determine whether the memory usage is abnormal or expected.
Generally speaking, I would expect an Ambassador pod to appreciate a ~1 GB memory limit so it has plenty of breathing room for validating Envoy configs and managing the control plane in memory.
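For anyone wanting to apply that suggestion, a sketch of a ~1 GB limit in the Ambassador container's Deployment spec; the request values below are arbitrary examples, not recommendations from this thread:

```yaml
# Sketch: give the Ambassador container ~1Gi of memory headroom.
# Request values are illustrative assumptions only.
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    memory: 1Gi
```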
hi @esmet I'm inside the pod itself; what should I look for, and where?
@esmet something like this? this is a screenshot i found that i made after the upgrade
@dzkaraka @guongle-ssense We solved our issue by setting the AMBASSADOR_LEGACY_MODE env variable to TRUE; the new version then behaves the same as the old one. This isn't a proper solution, but it works for us as a temporary workaround.
BR
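The workaround described above, expressed as a container env entry in the Deployment; only the variable name and value come from this thread, the surrounding structure is a sketch:

```yaml
# Sketch: enable legacy mode on the Ambassador container.
env:
  - name: AMBASSADOR_LEGACY_MODE
    value: "true"
```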
Thanks for the input, but seeing that the official docs don't recommend it, I'm a bit iffy on it; I'll keep poking for a permanent solution.
Worst case, at least it's been validated that it does help.
Thanks for posting that information. From the top output I can't tell if the busyambassador process that has a 1600Mb virtual size also has a high resident size. Try htop?
In general, depending on how much memory that process is using, I can say that Ambassador using more memory is now fairly normal. I would imagine that the previous 512Mb wouldn't be enough.
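If htop isn't available inside the container, a rough way to compare virtual vs resident size is to read /proc directly. This is a sketch assuming a Linux container; "busyambassador" is the process name reported earlier in the thread:

```shell
# Compare virtual (VmSize) and resident (VmRSS) memory for busyambassador.
# Falls back to the current shell's own status if the process isn't found.
pid="$(pgrep -o busyambassador || echo self)"
grep -E '^(VmSize|VmRSS):' "/proc/${pid}/status"
```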
@esmet we'll try to provide better info through htop, as requested. Could you please provide some "expected estimates" of how much RAM/CPU the "new" version of Ambassador uses? On the previous version (or with legacy mode turned on) our Ambassador pods run just fine with 1 CPU & 512 MB RAM and 45 mappings registered.
There's no hard and fast rule unfortunately. Are you using Endpoint routing by any chance?
Here are container and top views of the containers running for us @esmet
@esmet, in our case we're using the default routing resolver; we don't have any overrides.
@esmet we are also seeing this issue. This is our htop.
Our configuration:
Other configurations:

```
AMBASSADOR_FAST_RECONFIGURE: "true"
AMBASSADOR_DRAIN_TIME: 5
AMBASSADOR_AMBEX_SNAPSHOT_COUNT: 0
```
And this is our container memory usage over the last hour:
Our Emissary-ingress pods serve 76 mappings and in the past mostly needed less than 1 GB of memory, but at some point memory usage started climbing steadily. It currently peaks at ~10 GB before a pod is OOM-killed. This caused us a lot of trouble. It looks like something goes wrong inside Emissary during startup, and also when applications are scaled up or down.
Tried out all the suggested config changes, disabled metrics, upgraded the version, downgraded again, but nothing helped. Unfortunately emissary is basically unusable for us. The only working solution so far is to switch to another ingress controller like haproxy or nginx.
**Describe the bug**
On our Ambassador API Gateway deployment we noticed a severe memory-use increase (approximately 8x) when we upgraded the Helm chart version from 6.5.10 to 6.6.2 (Ambassador version from 1.8.1 to 1.12.2). After we reviewed our setup and checked various changelogs/docs we found the env config AMBASSADOR_LEGACY_MODE; after setting its value to true, memory dropped back to "normal/before the upgrade".

**To Reproduce**
Steps to reproduce the behavior:
1. Upgrade Ambassador to 1.12.2

**Expected behavior**
Memory use shouldn't increase while AMBASSADOR_LEGACY_MODE remains false.

**Versions (please complete the following information):**

**Additional context**
More information about our setup can be provided as needed, but for starters: