microsoft / opengcs

Moved to https://github.com/microsoft/hcsshim/tree/master/internal/guest. If you wish to open PRs/submit issues please do so against https://github.com/microsoft/hcsshim.
MIT License

Add support for GCS reserve memory #364

Closed · jterry75 closed this 4 years ago

jterry75 commented 4 years ago

Signed-off-by: Justin Terry (VM) juterry@microsoft.com

kevpar commented 4 years ago

Should we default the gcs limit to 0 (no limit) instead? I'm not sure what the limit accomplishes, since if it's hit the whole pod will go down anyway, and it is currently hard to troubleshoot in that state. We could still leave in the ability to configure a limit via the command line.
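(A minimal sketch of what that proposal could look like, assuming a hypothetical Go flag named `gcs-mem-limit-mb`; the actual opengcs flag name and wiring may differ. A value of 0 leaves the cgroup unrestricted:)

```go
package main

import (
	"flag"
	"fmt"
)

// Hypothetical flag, named for illustration only; 0 is interpreted
// as "no limit", the default proposed above.
var gcsMemLimitMB = flag.Uint64("gcs-mem-limit-mb", 0,
	"memory limit for the GCS cgroup in MB (0 = unlimited)")

func main() {
	flag.Parse()
	if *gcsMemLimitMB == 0 {
		fmt.Println("no GCS memory limit configured; leaving the cgroup unrestricted")
		return
	}
	fmt.Printf("would apply a GCS memory limit of %d MB\n", *gcsMemLimitMB)
}
```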

mkatri commented 4 years ago

> Should we default the gcs limit to 0 (no limit) instead? I'm not sure what the limit accomplishes, since if it's hit the whole pod will go down anyway, and it is currently hard to troubleshoot in that state. We could still leave in the ability to configure a limit via the command line.

Wouldn't we at least know which container misbehaved? GCS or workload?

jterry75 commented 4 years ago

> Should we default the gcs limit to 0 (no limit) instead? I'm not sure what the limit accomplishes, since if it's hit the whole pod will go down anyway, and it is currently hard to troubleshoot in that state. We could still leave in the ability to configure a limit via the command line.

I think as a best practice we should keep the gcs cgroup and use its limit. I agree that it will crash if it hits this limit, but it would be nice to know when it does. There is no reason we should be seeing 50 MB of usage, and if we keep seeing crashes at that limit, we know this is not the real issue. Without the limit there is no way to know whether a GCS is running at 200 MB and stealing from workload memory.
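(For illustration, a sketch of confining the GCS to its own memory cgroup, assuming a cgroup v1 memory controller mounted at the usual path; the 50 MB figure comes from the comment above, and the real code now lives in hcsshim and may differ:)

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

const (
	gcsCgroup  = "/sys/fs/cgroup/memory/gcs" // assumed cgroup v1 mount point
	limitBytes = 50 * 1024 * 1024            // 50 MB ceiling from the discussion above
)

func main() {
	// Create a dedicated cgroup for the GCS.
	if err := os.MkdirAll(gcsCgroup, 0o755); err != nil {
		panic(err)
	}
	// Apply a hard memory limit. If the GCS exceeds it, the kernel
	// OOM-kills the GCS, which is a visible failure, instead of the
	// GCS silently stealing memory from the workload containers.
	if err := os.WriteFile(filepath.Join(gcsCgroup, "memory.limit_in_bytes"),
		[]byte(strconv.Itoa(limitBytes)), 0o644); err != nil {
		panic(err)
	}
	// Move the current (GCS) process into the cgroup.
	if err := os.WriteFile(filepath.Join(gcsCgroup, "cgroup.procs"),
		[]byte(strconv.Itoa(os.Getpid())), 0o644); err != nil {
		panic(err)
	}
	fmt.Println("GCS confined to its own 50 MB memory cgroup")
}
```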

jterry75 commented 4 years ago

DO NOT MERGE: doing some studies to try to find the right numbers here.
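(One way such a study could sample the numbers, sketched against the same assumed cgroup v1 layout as above; cgroup v1 exposes current and peak usage as files:)

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

func main() {
	// memory.usage_in_bytes is current usage; memory.max_usage_in_bytes
	// is the high-water mark, which is what a sizing study cares about.
	for _, f := range []string{"memory.usage_in_bytes", "memory.max_usage_in_bytes"} {
		b, err := os.ReadFile("/sys/fs/cgroup/memory/gcs/" + f)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s = %s\n", f, strings.TrimSpace(string(b)))
	}
}
```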

kevpar commented 4 years ago

Closing this since we took another fix for this issue: https://github.com/microsoft/opengcs/pull/372