open-telemetry / opentelemetry-collector-contrib

Contrib repository for the OpenTelemetry Collector
https://opentelemetry.io
Apache License 2.0

[receiver/awscontainerinsight] Amazon Linux 2023 container instance does not have the expected cgroup path. #33716

Open 0nihajim opened 1 week ago

0nihajim commented 1 week ago

Component(s)

receiver/awscontainerinsight

Is your feature request related to a problem? Please describe.

When the awscontainerinsightreceiver runs on an AL2023 ECS container instance, the following warnings are logged frequently.

2024-06-21T06:02:46.395Z warn ecsInfo/cgroup.go:121 Failed to get cpu cgroup path for task: {"kind": "receiver", "name": "awscontainerinsightreceiver", "data_type": "metrics", "error": "CGroup Path \"/cgroup/cpu/ecs/AL2023/dc454d1e84c0478fa8ed683523fe7e55\" does not exist"}

2024-06-21T06:02:46.395Z warn ecsInfo/cgroup.go:142 Failed to get memory cgroup path for task: %v {"kind": "receiver", "name": "awscontainerinsightreceiver", "data_type": "metrics", "error": "CGroup Path \"/cgroup/memory/ecs/AL2023/dc454d1e84c0478fa8ed683523fe7e55\" does not exist"}

2024-06-21T06:02:46.300Z warn cadvisor/cadvisor_linux.go:211 Can't get mem or cpu reserved! {"kind": "receiver", "name": "awscontainerinsightreceiver", "data_type": "metrics"}

Expected Result

awscontainerinsightreceiver periodically reads the CPU and memory reserved for tasks from the cgroup paths /sys/fs/cgroup/memory/ecs/<cluster-name>/<task-id> and /sys/fs/cgroup/cpu/ecs/<cluster-name>/<task-id>. [1]

Then the awscontainerinsightreceiver divides those reserved values by the instance's CPU and memory limits obtained from cAdvisor. [2]

This calculation yields the percentage of the container instance's CPU and memory that is reserved by tasks (instance_cpu_reserved_capacity, instance_memory_reserved_capacity). [3]
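As a rough worked example with illustrative numbers (not taken from this report): a task that reserves 1 vCPU shows up as cpuReserved = 1024 cpu shares, and cAdvisor reports a 4-vCPU instance as cpuLimits = 4000 millicores, so instance_cpu_reserved_capacity = 1024 / (4000 * 1.024) * 100 = 25%.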

Actual Result

On AL2023 the expected cgroup paths do not exist, so the values of cpuReserved and memReserved cannot be retrieved and remain 0. As a result, the instance_cpu_reserved_capacity and instance_memory_reserved_capacity values sent to CloudWatch are always reported as 0.

Steps to Reproduce

The issue can be reproduced by simply launching a container instance from the AL2023 ECS-optimized AMI and deploying the ADOT Collector as a daemon on that instance using the Quick Setup in the following document.

https://docs.aws.amazon.com/ja_jp/AmazonCloudWatch/latest/monitoring/deploy-container-insights-ECS-OTEL.html

Describe the solution you'd like

AL2023 has switched to cgroup v2. [4] The directory structure under /sys/fs/cgroup is completely different from that of cgroup v1, so code changes are required.

Could you please support AL2023?
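Not a concrete proposal, but a minimal sketch of what reading the equivalent values from the cgroup v2 unified hierarchy might look like, assuming the receiver keeps returning CPU in the v1 "shares" scale so the downstream * 1.024 conversion in [2] still applies. The task path, the helper names (readCPUMax, cpuReservedV2, memReservedV2), and the cpu.weight-to-shares conversion (inverse of the systemd shares-to-weight mapping) are assumptions on my part; where ECS on AL2023 actually places task cgroups would need to be confirmed.

package main

import (
    "fmt"
    "math"
    "os"
    "path/filepath"
    "strconv"
    "strings"
)

// readCPUMax parses the cgroup v2 cpu.max file, whose content is
// "<quota> <period>" or "max <period>" when no hard limit is set.
func readCPUMax(cgroupPath string) (quota, period int64, limited bool, err error) {
    raw, err := os.ReadFile(filepath.Join(cgroupPath, "cpu.max"))
    if err != nil {
        return 0, 0, false, err
    }
    fields := strings.Fields(string(raw))
    if len(fields) != 2 {
        return 0, 0, false, fmt.Errorf("unexpected cpu.max content: %q", raw)
    }
    if fields[0] == "max" {
        return 0, 0, false, nil // no hard limit configured
    }
    if quota, err = strconv.ParseInt(fields[0], 10, 64); err != nil {
        return 0, 0, false, err
    }
    if period, err = strconv.ParseInt(fields[1], 10, 64); err != nil {
        return 0, 0, false, err
    }
    return quota, period, period > 0, nil
}

// cpuReservedV2 mirrors getCPUReservedInTask on cgroup v2: prefer the hard
// limit from cpu.max, otherwise fall back to cpu.weight converted back to the
// v1 "shares" scale (inverse of the shares->weight mapping used by systemd).
func cpuReservedV2(cgroupPath string) int64 {
    if quota, period, limited, err := readCPUMax(cgroupPath); err == nil && limited {
        return int64(math.Ceil(float64(1024*quota) / float64(period)))
    }
    raw, err := os.ReadFile(filepath.Join(cgroupPath, "cpu.weight"))
    if err != nil {
        return 0
    }
    weight, err := strconv.ParseInt(strings.TrimSpace(string(raw)), 10, 64)
    if err != nil {
        return 0
    }
    // cpu.weight (1-10000) -> cpu.shares (2-262144)
    return 2 + ((weight-1)*262142)/9999
}

// memReservedV2 mirrors the simple case of getMEMReservedInTask: a task-level
// hard limit from memory.max ("max" means unlimited, like the v1 magic value).
func memReservedV2(cgroupPath string) int64 {
    raw, err := os.ReadFile(filepath.Join(cgroupPath, "memory.max"))
    if err != nil {
        return 0
    }
    s := strings.TrimSpace(string(raw))
    if s == "max" {
        return 0 // would need the per-container fallback used by the v1 code
    }
    v, err := strconv.ParseInt(s, 10, 64)
    if err != nil {
        return 0
    }
    return v
}

func main() {
    // Hypothetical path: where ECS places task cgroups on AL2023 must be verified.
    taskPath := "/sys/fs/cgroup/ecstasks/example-task"
    fmt.Println(cpuReservedV2(taskPath), memReservedV2(taskPath))
}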

Additional context

reference

[1] https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/awscontainerinsightreceiver/internal/ecsInfo/cgroup.go

func (c *cgroupScanner) refresh() {
    if c.ecsTaskInfoProvider == nil {
        return
    }

    cpuReserved := int64(0)
    memReserved := int64(0)

    for _, task := range c.ecsTaskInfoProvider.getRunningTasksInfo() {
        taskID, err := getTaskCgroupPathFromARN(task.ARN)
        if err != nil {
            c.logger.Warn("Failed to get ecs taskid from arn: ", zap.Error(err))
            continue
        }
        // ignore the one only consume 2 shares which is the default value in cgroup
        if cr := c.getCPUReservedInTask(taskID, c.containerInstanceInfoProvider.GetClusterName()); cr > 2 {
            cpuReserved += cr
        }
        memReserved += c.getMEMReservedInTask(taskID, c.containerInstanceInfoProvider.GetClusterName(), task.Containers)
    }
    c.Lock()
    defer c.Unlock()
    c.memReserved = memReserved
    c.cpuReserved = cpuReserved
}

func (c *cgroupScanner) getCPUReservedInTask(taskID string, clusterName string) int64 {
    cpuPath, err := getCGroupPathForTask(c.mountPoint, "cpu", taskID, clusterName)
    if err != nil {
        c.logger.Warn("Failed to get cpu cgroup path for task: ", zap.Error(err))
        return int64(0)
    }

    // check if hard limit is configured
    if cfsQuota, err := readInt64(cpuPath, "cpu.cfs_quota_us"); err == nil && cfsQuota != -1 {
        if cfsPeriod, err := readInt64(cpuPath, "cpu.cfs_period_us"); err == nil && cfsPeriod > 0 {
            return int64(math.Ceil(float64(1024*cfsQuota) / float64(cfsPeriod)))
        }
    }

    if shares, err := readInt64(cpuPath, "cpu.shares"); err == nil {
        return shares
    }

    return int64(0)
}

func (c *cgroupScanner) getMEMReservedInTask(taskID string, clusterName string, containers []ECSContainer) int64 {
    memPath, err := getCGroupPathForTask(c.mountPoint, "memory", taskID, clusterName)
    if err != nil {
        c.logger.Warn("Failed to get memory cgroup path for task: %v", zap.Error(err))
        return int64(0)
    }

    if memReserved, err := readInt64(memPath, "memory.limit_in_bytes"); err == nil && memReserved != kernelMagicCodeNotSet {
        return memReserved
    }

    // sum the containers' memory if the task's memory limit is not configured
    sum := int64(0)
    for _, container := range containers {
        containerPath := filepath.Join(memPath, container.DockerID)

        // soft limit first
        if softLimit, err := readInt64(containerPath, "memory.soft_limit_in_bytes"); err == nil && softLimit != kernelMagicCodeNotSet {
            sum += softLimit
            continue
        }

        // try hard limit when soft limit is not configured
        if hardLimit, err := readInt64(containerPath, "memory.limit_in_bytes"); err == nil && hardLimit != kernelMagicCodeNotSet {
            sum += hardLimit
        }
    }
    return sum
}

[2] https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/awscontainerinsightreceiver/internal/cadvisor/cadvisor_linux.go#L199C1-L231C2

for _, cadvisormetric := range cadvisormetrics {
    if cadvisormetric.GetMetricType() == ci.TypeInstance {
        metricMap := cadvisormetric.GetFields()
        cpuReserved := c.ecsInfo.GetCPUReserved()
        memReserved := c.ecsInfo.GetMemReserved()
        if cpuReserved == 0 && memReserved == 0 {
            c.logger.Warn("Can't get mem or cpu reserved!")
        }
        cpuLimits, cpuExist := metricMap[ci.MetricName(ci.TypeInstance, ci.CPULimit)]
        memLimits, memExist := metricMap[ci.MetricName(ci.TypeInstance, ci.MemLimit)]

        if !cpuExist && !memExist {
            c.logger.Warn("Can't get mem or cpu limit")
        } else {
            // cgroup standard cpulimits should be cadvisor standard * 1.024
            metricMap[ci.MetricName(ci.TypeInstance, ci.CPUReservedCapacity)] = float64(cpuReserved) / (float64(cpuLimits.(int64)) * 1.024) * 100
            metricMap[ci.MetricName(ci.TypeInstance, ci.MemReservedCapacity)] = float64(memReserved) / float64(memLimits.(int64)) * 100
        }

[3] https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Container-Insights-metrics-ECS.html

[4] https://docs.aws.amazon.com/linux/al2023/ug/cgroupv2.html

AL2 supports cgroupv1, and AL2023 supports cgroupv2. This is notable if running containerized workloads, such as when using AL2023-based Amazon ECS AMIs to host containerized workloads.
github-actions[bot] commented 1 week ago

Pinging code owners: