Open nunofernandes opened 11 months ago
Thanks for reaching out regarding this. We have recently restructured our interface IP reporting to avoid a quadratic growth in computation caused by Go's syscall behavior on Linux, which dumps the entire route table on each interface.Addrs() call. This leads to escalating CPU usage on systems with many network interfaces.
That restructuring does not address this problem, as the order of the interfaces returned is decided by the OS. However, we do want to verify whether the problem still exists in agent version 3.3.1142.0, and if it does, we can further evaluate a fix similar to your PR.
Hello.. Tried with the latest one
# yum update https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm
....
Upgraded:
amazon-ssm-agent-3.3.987.0-1.x86_64
# rpm -qi amazon-ssm-agent
Name : amazon-ssm-agent
Version : 3.3.987.0
Release : 1
Architecture: x86_64
Install Date: 2024-10-23T17:24:47 CEST
Group : Amazon/Tools
Size : 127685837
License : Apache License, Version 2.0
Signature : RSA/SHA1, 2024-09-23T12:51:02 CEST, Key ID bc1f495c97dd04ed
Source RPM : amazon-ssm-agent-3.3.987.0-1.src.rpm
Build Date : 2024-09-23T11:56:15 CEST
Build Host : build.amazon.com
Relocations : (not relocatable)
Packager : Amazon.com, Inc. <http://aws.amazon.com>
Vendor : Amazon.com
URL : http://docs.aws.amazon.com/ssm/latest/APIReference/Welcome.html
Summary : Manage EC2 Instances using SSM APIs
Description :
This package provides Amazon SSM Agent for managing EC2 Instances using SSM APIs
That is not the version you mentioned (3.3.1142.0), so I am waiting for that one to land on the RPM repo/URL. With the version available, it still happens:
Once that version lands on the repo, I can try it. Do you know when that version will be available?
The version is deploying through regions now and will reach global availability sometime next week. For testing purposes, you can get the latest version here:
$ sudo yum update https://s3.eu-north-1.amazonaws.com/amazon-ssm-eu-north-1/latest/linux_amd64/amazon-ssm-agent.rpm
Last metadata expiration check: 1 day, 19:49:40 ago on Mon Oct 21 19:52:51 2024.
amazon-ssm-agent.rpm 8.9 MB/s | 24 MB 00:02
Dependencies resolved.
==================================================================================================================================
Package Architecture Version Repository Size
==================================================================================================================================
Upgrading:
amazon-ssm-agent x86_64 3.3.1142.0-1 @commandline 24 M
Transaction Summary
==================================================================================================================================
Upgrade 1 Package
Hello,
Just tested that new version and I still get the ip address from docker0:
So, the issue is still there :(
I see. It looks like we would need a dedicated way to filter this out if we decide to go there. When this feature was first designed, we did not define exactly which interface to return. We will evaluate potential changes and/or documentation to define this feature. One of the first things that comes to mind is to report the default network interface, but the way to determine it differs across the platforms we support. Since the Go standard library does not have that capability out of the box, we would have to implement potentially unstable methods for each OS and maintain them as those systems evolve. That needs further evaluation before we take it up.
That is why I sent the patch https://github.com/aws/amazon-ssm-agent/pull/555 that would allow the user to exclude certain interfaces that they know aren't meant to be used. Let me know if that is the route forward and if so, I can rebase the patch with the current codebase.
Unfortunately that route is blocked now, as we no longer filter by interface. The reason is that the Go syscall behavior dumps the entire route table when looking up the properties of a single interface. This means that on hosts with a large number of interfaces (e.g. a high number of containers), the CPU consumption becomes quadratic if we loop over and filter interfaces, and it is very important to us that we keep our resource consumption low.
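To make the cost difference concrete, here is a minimal sketch of the two access patterns using only the Go standard library. The function names are hypothetical, and whether the agent uses exactly these calls is an assumption. The first function calls Addrs() once per interface, which is the pattern described above as quadratic; the second makes a single net.InterfaceAddrs() call, which returns addresses without interface names and therefore cannot be filtered by name.

```go
package main

import (
	"fmt"
	"net"
)

// countPerInterface lists the interfaces, then asks each one for its
// addresses. Per the discussion above, each Addrs() call on Linux
// triggers a full table dump, so N interfaces cost N dumps.
func countPerInterface() (int, error) {
	ifaces, err := net.Interfaces()
	if err != nil {
		return 0, err
	}
	total := 0
	for _, ifi := range ifaces {
		addrs, err := ifi.Addrs()
		if err != nil {
			return 0, err
		}
		total += len(addrs)
	}
	return total, nil
}

// countSingleCall fetches all addresses in one call. The result carries
// no interface names, so name-based exclusion is not possible here.
func countSingleCall() (int, error) {
	addrs, err := net.InterfaceAddrs()
	if err != nil {
		return 0, err
	}
	return len(addrs), nil
}

func main() {
	a, errA := countPerInterface()
	b, errB := countSingleCall()
	fmt.Println("per-interface:", a, errA)
	fmt.Println("single call:", b, errB)
}
```

Timing both functions on a host with many container interfaces would be one way to check how large the difference is in practice.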
Well, what about the following scheme (I haven't seen the current codebase, so I'm just in suggestion mode here):
Would that work?
Hello,
We have an on-prem server (Rocky Linux 8) with SSM Agent (amazon-ssm-agent-3.2.2016.0-1.x86_64).
In AWS Fleet Manager, that instance is registered with the IP address from docker0 (172.17.0.1):
It was working fine until we lost DHCP for a few hours, and now, even after restarting the SSM agent, I always get docker0's IP registered.
If I do an
ifconfig docker0 down; systemctl restart amazon-ssm-agent.service; ifconfig docker0 up
it works (registers the correct IP), but after some time it goes back to the previous docker0 IP address registered in SSM. I think it's the code at
agent/platform/platform.go
that is sorting the interfaces differently (guessing):
What would be the best option here (except rebooting the server)?