apache / rocketmq-operator

Apache RocketMQ Operator
https://rocketmq.apache.org/
Apache License 2.0

`free` is a bad way to compute the memory available to the broker or nameservers in a container. #201

Closed MrLYG closed 5 months ago

MrLYG commented 6 months ago

https://github.com/apache/rocketmq-operator/blob/a8665716c8028741ec106814798e39fe07e6dcaa/images/broker/alpine/runbroker-customize.sh#L57C34-L57C73

I've noticed that the startup script uses the system command `free` to obtain memory information in our Kubernetes setup. Inside a container, `free` reports the memory of the entire host rather than the resource limit of the individual container. This leads to a significant issue: if the memory allocated to a pod is substantially smaller than the host's total memory, any memory settings derived from that value can exceed the pod's limit, and the pod can encounter Out-Of-Memory (OOM) errors.

I believe I can contribute to addressing this issue. My approach would be to modify the startup script so that its resource queries are container-aware, possibly by reading cgroup data directly to get the container's actual limits.
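For illustration, here is a minimal sketch of what a container-aware lookup could look like. It is not the code from the repository or from any PR. The function names `container_memory_limit_mb` and `effective_memory_mb`, and the optional path arguments (added only so the logic can be exercised against fake files), are hypothetical. It assumes the usual mount points: `/sys/fs/cgroup/memory.max` for cgroup v2 and `/sys/fs/cgroup/memory/memory.limit_in_bytes` for cgroup v1.

```shell
#!/bin/sh
# Sketch only: function names and the path arguments are hypothetical.

# Print the container memory limit in MB.
# Returns 1 when no usable limit is found (file missing or unlimited).
container_memory_limit_mb() {
    cgroup_root="${1:-/sys/fs/cgroup}"
    if [ -r "$cgroup_root/memory.max" ]; then
        # cgroup v2: a byte count, or the literal "max" when unlimited
        limit=$(cat "$cgroup_root/memory.max")
    elif [ -r "$cgroup_root/memory/memory.limit_in_bytes" ]; then
        # cgroup v1: a byte count; unlimited shows up as a huge number
        limit=$(cat "$cgroup_root/memory/memory.limit_in_bytes")
    else
        return 1
    fi
    case "$limit" in
        ''|*[!0-9]*) return 1 ;;   # "max", empty, or anything non-numeric
    esac
    # More than 18 digits is treated as "unlimited" (covers the cgroup v1
    # sentinel) and keeps the arithmetic below within 64-bit range.
    [ "${#limit}" -gt 18 ] && return 1
    echo $((limit / 1024 / 1024))
}

# Print the memory the process should plan for, in MB:
# the container limit when one is set and is below host memory,
# otherwise the host total from /proc/meminfo.
effective_memory_mb() {
    cgroup_root="${1:-/sys/fs/cgroup}"
    meminfo="${2:-/proc/meminfo}"
    host_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' "$meminfo")
    if limit_mb=$(container_memory_limit_mb "$cgroup_root") \
        && [ "$limit_mb" -lt "$host_mb" ]; then
        echo "$limit_mb"
    else
        echo "$host_mb"
    fi
}
```

In a startup script, the value taken from `free` could then be replaced by something like `mem_mb=$(effective_memory_mb)`. Falling back to the host total keeps the behavior unchanged when the script runs outside a container or without a memory limit.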

I am ready to start working on this and plan to submit a pull request (PR) with the proposed changes. Please let me know if there are any specific guidelines or processes I should follow for this contribution.

caigy commented 6 months ago

@MrLYG Welcome! Here is our PR guideline. If you have any questions, please do not hesitate to ask in the community.

As for the issue itself, the nameserver startup script may already detect memory in a container-aware way; it could be a helpful reference when you optimize the broker script: https://github.com/apache/rocketmq-operator/blob/a8665716c8028741ec106814798e39fe07e6dcaa/images/namesrv/alpine/runserver-customize.sh#L57-L66

Currently, #199 is adding cgroup v2 support.