Open felipecrs opened 3 years ago
Hi @felipecrs, yes this is very much a feature we have in mind for Sysbox. Not just for memory, but also for CPUs.
The key is to have Sysbox "virtualize" the /proc/meminfo and /proc/cpuinfo resources exposed inside the containers, according to the corresponding cgroup limits assigned to the container. While Sysbox has the underlying infrastructure to do this already, we've not had the work cycles yet to implement this feature.
Also, it's not clear to us if this will be a Sysbox Enterprise only feature, or if it will go into Sysbox Community Edition too. It's one of those things we must carefully think about to create a balance between community benefit vs. sustainable business.
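For illustration only, here is a minimal sketch of what "virtualizing" /proc/meminfo according to a cgroup limit could look like. It is not Sysbox's implementation. It assumes cgroup v2, where memory.max contains either the string max or a byte count, and the function names (parse_memory_max, virtualize_meminfo) are hypothetical.

```python
def parse_memory_max(text):
    """Parse the content of a cgroup v2 memory.max file.

    Returns None when no limit is set ("max"), otherwise the limit in bytes.
    """
    value = text.strip()
    return None if value == "max" else int(value)


def virtualize_meminfo(host_meminfo, limit_bytes, usage_bytes):
    """Return a copy of /proc/meminfo text clamped to a cgroup limit.

    Only MemTotal, MemFree and MemAvailable are rewritten in this sketch;
    a real emulation would have to treat many more fields consistently.
    """
    if limit_bytes is None:
        return host_meminfo  # no limit: expose the host view unchanged

    limit_kb = limit_bytes // 1024
    free_kb = max(limit_kb - usage_bytes // 1024, 0)

    out = []
    for line in host_meminfo.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if key in ("MemTotal", "MemFree", "MemAvailable") and fields:
            host_kb = int(fields[0])
            cap_kb = limit_kb if key == "MemTotal" else free_kb
            # Never report more than the host itself has.
            line = f"{key + ':':<16}{min(host_kb, cap_kb):>8} kB"
        out.append(line)
    return "\n".join(out) + "\n"
```

With a 2 GiB limit and 512 MiB in use, a container would then see MemTotal as 2097152 kB instead of the host's total, while untouched fields pass through as they are.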
I'm very happy to know that this feature is considered!
And yes, I should have included CPUs in the original story as the rationale and use case applies for both.
FYI, we are hoping to get to this feature before the year is over.
Out of curiosity, I found https://github.com/fabiokung/procg. I'm sharing it here in case it is useful as a reference for the implementation.
Thanks @felipecrs; I'll take a look, as several users are now asking for /proc/cpuinfo and /proc/meminfo emulation in Sysbox containers. We were hoping to get to this by the end of last year, but it looks like it will be closer to the first half of this year.
For completeness, in a K8s environment, I was able to achieve this result by using:
Together with CPU Management Policy as Static:
The static policy is needed because, for some reason, lxcfs isn't masking CPUs by itself (although it works for memory).
lxcfs is easy to set up using this helm chart, but changing the CPU Management Policy configuration in kubelet can be a challenge depending on how your cluster is provisioned.
Not to mention that it would apply such a policy for all pods, without the option to customize.
So it would still be very nice to have this feature in Sysbox itself, as it would streamline the whole process of achieving VM-like containers.
Thanks @felipecrs, that all makes sense. And yes, a Sysbox based approach should provide more flexibility by allowing the user to specify per-pod resources, ideally by honoring the cgroup constraints defined by the user.
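As a rough sketch of what "honoring the cgroup constraints" could mean for CPUs, the snippet below derives the CPU count a container might be shown from two cgroup v2 files: cpu.max (a quota and a period, or max for no quota) and a cpuset list such as 0-3,8. This is an assumption about one possible approach, not how Sysbox or lxcfs actually behave, and all function names are hypothetical.

```python
import math


def count_cpuset(text):
    """Count the CPUs in a cpuset list such as '0-3,8' (empty list -> 0)."""
    count = 0
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        count += int(hi or lo) - int(lo) + 1
    return count


def cpus_from_cpu_max(text, host_cpus):
    """Translate cgroup v2 cpu.max ('<quota> <period>') into a CPU count."""
    quota, period = text.split()
    if quota == "max":
        return host_cpus  # no quota configured
    # Round up so that e.g. 1.5 CPUs of quota are shown as 2 CPUs.
    return min(host_cpus, max(1, math.ceil(int(quota) / int(period))))


def effective_cpus(cpu_max_text, cpuset_text, host_cpus):
    """Combine quota and cpuset: the container sees the smaller of the two."""
    pinned = count_cpuset(cpuset_text) or host_cpus
    return min(pinned, cpus_from_cpu_max(cpu_max_text, host_cpus))
```

Rounding the quota up is a design choice in this sketch; rounding down would be more conservative for build tools that size their thread pools from the CPU count.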
I noticed this in the latest release notes:
- Fix sysbox emulation of /proc and /sys in containers for kernels 6.5+
Is there any chance this is implemented already?
Hi @felipecrs, no I don't believe so; the fix described above refers to a problem where starting with kernel 6.5+, sysbox's emulation of /proc and /sys inside a container was totally broken due to a change in the kernel. It does not address this current issue unfortunately. Thanks.
Got it. Thank you!
I'm not sure whether this is out of scope, but I'm opening this issue for discussion anyway.
When we run:
It shows the total memory of the host machine despite the --memory constraint. The same happens when using sysbox-runc. This is intended.
The thing is: as Sysbox claims to transform containers into VM-like ones, would it be possible for Sysbox to enforce the total memory that the container sees, just like in a normal VM?
With --memory today, if my containers exceed the limit, they get killed. Ideally, I would like the containers not to see the host's total memory, so that they manage the memory available to them the way they want without being killed by the daemon. In a CI/CD build pool this is a highly desirable feature, as I can assign a given amount of resources to each build but have no control over what happens inside such builds.
The same rationale and use case apply to CPUs as well.
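Until something like this exists in the runtime, a process that is aware of cgroups can work around the mismatch on its own. The sketch below is only an illustration of that idea: it takes the smaller of MemTotal from /proc/meminfo and the cgroup limit. The two paths are the conventional cgroup v2 and v1 mount points and may differ on a given system; the helper names are hypothetical.

```python
from pathlib import Path

# Conventional locations; assumptions that may not hold on every host.
CGROUP_LIMIT_PATHS = (
    "/sys/fs/cgroup/memory.max",                    # cgroup v2
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",  # cgroup v1
)


def read_first(paths):
    """Return the text of the first readable file in paths, or None."""
    for path in paths:
        try:
            return Path(path).read_text()
        except OSError:
            continue
    return None


def meminfo_total_bytes(meminfo_text):
    """Extract MemTotal from /proc/meminfo text and convert kB to bytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) * 1024
    raise ValueError("MemTotal not found")


def effective_memory_bytes(meminfo_text, cgroup_limit_text=None):
    """Memory the process should plan for: min(MemTotal, cgroup limit)."""
    total = meminfo_total_bytes(meminfo_text)
    if cgroup_limit_text is None or cgroup_limit_text.strip() == "max":
        return total
    # cgroup v1 reports "unlimited" as a huge number; min() covers that case.
    return min(total, int(cgroup_limit_text.strip()))


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo").read_text()
    print(effective_memory_bytes(meminfo, read_first(CGROUP_LIMIT_PATHS)))
```

This only helps programs that can be changed to do the lookup; arbitrary build tools that read /proc/meminfo directly are exactly the case the feature request is about.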