lmxia opened this issue 2 years ago
Memory watermark changes are shipped to a central system log through Kafka. The control plane is expected to monitor the log for High memory watermarks and route new requests away from the affected instances, so most of the time the Critical watermark will not be triggered. If for some reason memory usage continues to grow, we rely on a few best-effort defenses to keep the system running in a degraded state:

- `MAX_WORKERS_PER_APP` tuning

But indeed this is a bug in performance isolation. Currently Blueboat does not have a hard per-worker memory limit, so it is possible to trigger the Critical watermark from a single worker pretty quickly, before the control plane has time to respond.
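For illustration, the watermark levels are just thresholds on available memory; a minimal sketch with assumed names and thresholds (not Blueboat's actual code):

```rust
/// Illustrative only: watermark levels as thresholds on available memory.
/// The enum name and threshold scheme are assumptions for this sketch.
#[derive(Debug, PartialEq)]
enum MemoryWatermark {
    Normal,
    High,
    Critical,
}

fn classify(available_bytes: u64, high: u64, critical: u64) -> MemoryWatermark {
    if available_bytes < critical {
        MemoryWatermark::Critical
    } else if available_bytes < high {
        MemoryWatermark::High
    } else {
        MemoryWatermark::Normal
    }
}
```

Each transition between levels would be the "watermark change" event shipped to Kafka for the control plane to act on.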
The solution would be to add a per-worker resident set size (RSS) limit.
Would you be interested in working on this? :)
Yes, sure, I would like to do that.
Happy to review your PR!
There are several approaches to implementing a per-process RSS limit:
- cgroups. Pro: The limit is accurate. Con: May not play well with a sandboxed environment (seccomp/dropped privileges).
- `RLIMIT_AS` to limit the address space (VSZ) of each worker process. Pro: Simple and plays well with OS-level sandboxing. Con: Prevents us from enabling V8 virtual memory tricks for optimizing WebAssembly memory accesses.
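If the `RLIMIT_AS` route is taken, a minimal sketch using the `libc` crate (the helper name and call site are assumptions; Blueboat's actual sandbox setup may differ):

```rust
use libc::{rlimit, setrlimit, RLIMIT_AS};
use std::io;

/// Hypothetical helper: cap the calling process's address space (VSZ).
/// Intended to run in the worker process after fork, before privileges
/// are dropped, since the hard limit cannot be raised afterwards.
fn apply_address_space_limit(bytes: u64) -> io::Result<()> {
    let lim = rlimit {
        rlim_cur: bytes as libc::rlim_t, // soft limit: allocations beyond this fail
        rlim_max: bytes as libc::rlim_t, // hard limit: cannot be raised later
    };
    // SAFETY: setrlimit only reads the struct we pass in.
    if unsafe { setrlimit(RLIMIT_AS, &lim) } != 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(())
}
```

Note that this caps virtual memory rather than RSS, which is exactly the trade-off against the V8 virtual memory tricks mentioned above.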
On the API side, the memory limit should be passed to the runtime as a field in `Metadata`.
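A minimal sketch of what that could look like (the field name `memory_limit_bytes` and the struct shape are assumptions, not Blueboat's actual `Metadata` definition):

```rust
use serde::{Deserialize, Serialize};

/// Hypothetical shape: the runtime already deserializes app metadata,
/// so the limit can ride along as an optional field.
#[derive(Serialize, Deserialize)]
pub struct Metadata {
    pub version: String,
    /// Per-worker memory limit in bytes; `None` means no hard limit.
    #[serde(default)]
    pub memory_limit_bytes: Option<u64>,
}
```

The runtime would then translate this value into the chosen OS-level limit when spawning each worker.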
What happened:
The sandbox is not fully isolated.
What you expected to happen:
If app A consumes massive amounts of memory, so that available memory drops below the Critical MemoryWatermark, the runtime tunes the smr parameter `MAX_WORKERS_PER_APP`; that is the current logic. But the tuning operation will scale in the running app B, even though the scale-in was triggered by app A. That does not sound like good isolation behavior: apps interfere with each other.
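Roughly, the problem is that the mitigation is a single global knob; a sketch of the interference (names and the halving policy are illustrative, not the actual tuning code):

```rust
/// Illustrative only: one global cap shared by all apps.
struct SmrParams {
    max_workers_per_app: usize,
}

/// When ANY app pushes available memory below the Critical watermark,
/// the global cap is tightened, scaling in workers of every app,
/// including an unrelated app B.
fn on_critical_watermark(params: &mut SmrParams) {
    params.max_workers_per_app = (params.max_workers_per_app / 2).max(1);
}
```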