Summary
This PR adds a new method for the fault handlers to obtain the task network config together with the default network interface name. It also adds a dedicated function to obtain the task metadata with the task network configuration set/initialized.
Implementation details
We are introducing a new function, GetTaskMetadataWithTaskNetworkConfig(), which obtains the task response with the task network configuration populated. If the task is in host mode, we try to obtain the default network interface name in the host network namespace, using the changes introduced in https://github.com/aws/amazon-ecs-agent/pull/4342.
GetTaskMetadataWithTaskNetworkConfig(): This function takes a container ID and a NetworkConfigClient object and obtains the TaskResponse for that container ID with the TaskNetworkConfig set/initialized. If the task is running in host mode and the OS platform is Linux, it gets the default network interface name in the host namespace via DefaultNetInterfaceName().
Modified getTaskMetadata() to take a new boolean parameter, includeTaskNetworkConfig. A TaskNetworkConfig object is created and added to the returned TaskResponse object only if the parameter is set to true.
Within the fault handlers, validateTaskMetadata() now uses GetTaskMetadataWithTaskNetworkConfig() to get the task metadata used for starting, stopping, and checking the status of faults.
Introduced a new NetworkConfigClient struct in the ecs-agent/tmds/utils/netconfig/ package. On Linux, this struct stores a netlinkwrapper.NetLink object, which is used to get the default network interface name on the host.
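The flow described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the type definitions, the in-memory `taskRecord` store, the `DeviceName` field, the `taskMetadataWithNetworkConfig` name, and the choice to return an error when the interface lookup fails are all assumptions made for the example. The real NetworkConfigClient stores a netlinkwrapper.NetLink object; a function field stands in for it here so the sketch runs without netlink.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the agent's types; all field names are assumptions.
type TaskNetworkConfig struct {
	NetworkMode string
	DeviceName  string // default interface name; only filled in for host mode on linux
}

type TaskResponse struct {
	TaskARN           string
	TaskNetworkConfig *TaskNetworkConfig
}

// NetworkConfigClient stands in for the real struct. A function field replaces
// the netlink object so the sketch has no external dependencies.
type NetworkConfigClient struct {
	DefaultNetInterfaceName func() (string, error)
}

// taskRecord is a hypothetical in-memory record keyed by container ID.
type taskRecord struct {
	TaskARN     string
	NetworkMode string
}

// getTaskMetadata attaches a TaskNetworkConfig only when includeTaskNetworkConfig is true.
func getTaskMetadata(store map[string]taskRecord, containerID string, includeTaskNetworkConfig bool) (*TaskResponse, error) {
	rec, ok := store[containerID]
	if !ok {
		return nil, errors.New("no task found for container ID " + containerID)
	}
	resp := &TaskResponse{TaskARN: rec.TaskARN}
	if includeTaskNetworkConfig {
		resp.TaskNetworkConfig = &TaskNetworkConfig{NetworkMode: rec.NetworkMode}
	}
	return resp, nil
}

// taskMetadataWithNetworkConfig sketches the behaviour described for
// GetTaskMetadataWithTaskNetworkConfig(). goos is a parameter (instead of
// reading runtime.GOOS) so the example behaves the same on every platform.
func taskMetadataWithNetworkConfig(store map[string]taskRecord, containerID string, client *NetworkConfigClient, goos string) (*TaskResponse, error) {
	resp, err := getTaskMetadata(store, containerID, true)
	if err != nil {
		return nil, err
	}
	if resp.TaskNetworkConfig.NetworkMode == "host" && goos == "linux" {
		name, err := client.DefaultNetInterfaceName()
		if err != nil {
			return nil, fmt.Errorf("unable to obtain default network interface name: %w", err)
		}
		resp.TaskNetworkConfig.DeviceName = name
	}
	return resp, nil
}

func main() {
	store := map[string]taskRecord{
		"host-container":   {TaskARN: "arn:example:task/host", NetworkMode: "host"},
		"awsvpc-container": {TaskARN: "arn:example:task/awsvpc", NetworkMode: "awsvpc"},
	}
	client := &NetworkConfigClient{
		DefaultNetInterfaceName: func() (string, error) { return "eth0", nil },
	}

	for _, id := range []string{"host-container", "awsvpc-container"} {
		resp, err := taskMetadataWithNetworkConfig(store, id, client, "linux")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s mode=%s device=%q\n", id, resp.TaskNetworkConfig.NetworkMode, resp.TaskNetworkConfig.DeviceName)
	}
}
```

In this sketch only the host mode task ends up with a device name; the awsvpc task keeps an empty one, and calling getTaskMetadata with includeTaskNetworkConfig set to false leaves TaskNetworkConfig nil.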
Testing
Moved the TestV4GetTaskMetadataWithTaskNetworkConfig tests into OS/platform-specific files; they now use the new GetTaskMetadataWithTaskNetworkConfig functionality.
New tests cover the changes: yes
Manual testing
Started a host mode task and tried starting, checking the status of, and stopping a network blackhole port (BHP) fault:
level=debug time=2024-09-26T20:13:03Z msg="Handling http request" method="POST" from="172.31.25.237:53534"
level=info time=2024-09-26T20:13:03Z msg="Received new request for request type: start network-blackhole-port" request="{\"Protocol\":\"tcp\",\"TrafficType\":\"egress\",\"Port\":1234}" requestType="start network-blackhole-port" tmdsEndpointContainerID="03da5270-90d9-4571-bf16-2dd86d87e9f6"
level=debug time=2024-09-26T20:13:03Z msg="Found route" Route={Ifindex: 2 Dst: <nil> Src: <nil> Gw: 172.31.16.1 Flags: [] Table: 254 Realm: 0}
level=debug time=2024-09-26T20:13:03Z msg="Found the associated network interface by the index" LinkName="eth0" LinkIndex=2
level=info time=2024-09-26T20:13:03Z msg="Obtained the default network interface name on host" taskARN="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f" defaultDeviceName="eth0"
level=info time=2024-09-26T20:13:03Z msg="[INFO] Black hole port fault is not running" command="iptables -C egress-tcp-1234 -p tcp --dport 1234 -j DROP" output="iptables: Bad rule (does a matching rule exist in that chain?).\n" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f" exitCode=1 netns="host"
level=info time=2024-09-26T20:13:03Z msg="[INFO] Attempting to start network black hole port fault" chain="egress-tcp-1234" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f" netns="host"
level=info time=2024-09-26T20:13:03Z msg="Successfully started fault" requestType="start network-blackhole-port" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}" response="{\"Status\":\"running\"}"
level=debug time=2024-09-26T20:13:13Z msg="Handling http request" method="POST" from="172.31.25.237:43658"
level=info time=2024-09-26T20:13:13Z msg="Received new request for request type: check status network-blackhole-port" request="{\"Protocol\":\"tcp\",\"TrafficType\":\"egress\",\"Port\":1234}" requestType="check status network-blackhole-port" tmdsEndpointContainerID="03da5270-90d9-4571-bf16-2dd86d87e9f6"
level=debug time=2024-09-26T20:13:13Z msg="Successfully parsed fault request payload" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}"
level=debug time=2024-09-26T20:13:13Z msg="Found route" Route={Ifindex: 2 Dst: <nil> Src: <nil> Gw: 172.31.16.1 Flags: [] Table: 254 Realm: 0}
level=debug time=2024-09-26T20:13:13Z msg="Found the associated network interface by the index" LinkName="eth0" LinkIndex=2
level=info time=2024-09-26T20:13:13Z msg="Obtained the default network interface name on host" taskARN="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f" defaultDeviceName="eth0"
level=info time=2024-09-26T20:13:13Z msg="[INFO] Black hole port fault has been found running" command="iptables -C egress-tcp-1234 -p tcp --dport 1234 -j DROP" output="" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f" netns="host"
level=info time=2024-09-26T20:13:13Z msg="Successfully check status fault" requestType="check status network-blackhole-port" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}" response="{\"Status\":\"running\"}"
level=debug time=2024-09-26T20:13:23Z msg="Handling http request" method="POST" from="172.31.25.237:42884"
level=info time=2024-09-26T20:13:23Z msg="Received new request for request type: stop network-blackhole-port" request="{\"Protocol\":\"tcp\",\"TrafficType\":\"egress\",\"Port\":1234}" requestType="stop network-blackhole-port" tmdsEndpointContainerID="03da5270-90d9-4571-bf16-2dd86d87e9f6"
level=debug time=2024-09-26T20:13:23Z msg="Successfully parsed fault request payload" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}"
level=debug time=2024-09-26T20:13:23Z msg="Found route" Route={Ifindex: 2 Dst: <nil> Src: <nil> Gw: 172.31.16.1 Flags: [] Table: 254 Realm: 0}
level=debug time=2024-09-26T20:13:23Z msg="Found the associated network interface by the index" LinkName="eth0" LinkIndex=2
level=info time=2024-09-26T20:13:23Z msg="Obtained the default network interface name on host" taskARN="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f" defaultDeviceName="eth0"
level=info time=2024-09-26T20:13:23Z msg="[INFO] Black hole port fault has been found running" netns="host" command="iptables -C egress-tcp-1234 -p tcp --dport 1234 -j DROP" output="" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f"
level=info time=2024-09-26T20:13:23Z msg="[INFO] Attempting to stop network black hole port fault" netns="host" chain="egress-tcp-1234" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f"
level=info time=2024-09-26T20:13:23Z msg="Successfully stopped fault" requestType="stop network-blackhole-port" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}" response="{\"Status\":\"stopped\"}"
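The "Found route" and "Found the associated network interface by the index" debug lines above suggest that the lookup picks a route with a gateway and no destination, then resolves that route's interface index to a link name. A minimal sketch of that idea, under stated assumptions: the `route` struct, the `linkNames` map, and the exact selection rule are illustrative stand-ins, and the real code goes through a netlinkwrapper.NetLink object instead.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// route is a stand-in holding only the fields that appear in the "Found route" log line.
type route struct {
	Ifindex int
	Dst     *net.IPNet
	Gw      net.IP
}

// defaultInterfaceName returns the name of the link used by the first route
// that has a gateway but no destination (treated here as the default route).
func defaultInterfaceName(routes []route, linkNames map[int]string) (string, error) {
	for _, r := range routes {
		if r.Dst == nil && r.Gw != nil {
			name, ok := linkNames[r.Ifindex]
			if !ok {
				return "", fmt.Errorf("no link found with index %d", r.Ifindex)
			}
			return name, nil
		}
	}
	return "", errors.New("no default route found")
}

func main() {
	_, subnet, err := net.ParseCIDR("172.31.16.0/20")
	if err != nil {
		panic(err)
	}
	routes := []route{
		{Ifindex: 2, Dst: subnet},                    // subnet route: has a destination, so it is skipped
		{Ifindex: 2, Gw: net.ParseIP("172.31.16.1")}, // default route, as in the log above
	}
	linkNames := map[int]string{1: "lo", 2: "eth0"}

	name, err := defaultInterfaceName(routes, linkNames)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("default network interface:", name)
}
```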
Iptables of the host mode task:
[ec2-user@ip-172-31-25-237 amazon-ecs-agent]$ sudo iptables -nL
Chain INPUT (policy ACCEPT)
target prot opt source destination
DROP tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:51678
DROP all -- !127.0.0.0/8 127.0.0.0/8 ! ctstate RELATED,ESTABLISHED,DNAT
Chain FORWARD (policy DROP)
target prot opt source destination
DOCKER-USER all -- 0.0.0.0/0 0.0.0.0/0
DOCKER-ISOLATION-STAGE-1 all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED
DOCKER all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain DOCKER (1 references)
target prot opt source destination
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target prot opt source destination
DOCKER-ISOLATION-STAGE-2 all -- 0.0.0.0/0 0.0.0.0/0
RETURN all -- 0.0.0.0/0 0.0.0.0/0
Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target prot opt source destination
DROP all -- 0.0.0.0/0 0.0.0.0/0
RETURN all -- 0.0.0.0/0 0.0.0.0/0
Chain DOCKER-USER (1 references)
target prot opt source destination
RETURN all -- 0.0.0.0/0 0.0.0.0/0
[ec2-user@ip-172-31-25-237 amazon-ecs-agent]$ sudo iptables -nL
Chain INPUT (policy ACCEPT)
target prot opt source destination
DROP tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:51678
DROP all -- !127.0.0.0/8 127.0.0.0/8 ! ctstate RELATED,ESTABLISHED,DNAT
Chain FORWARD (policy DROP)
target prot opt source destination
DOCKER-USER all -- 0.0.0.0/0 0.0.0.0/0
DOCKER-ISOLATION-STAGE-1 all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED
DOCKER all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
egress-tcp-1234 all -- 0.0.0.0/0 0.0.0.0/0
Chain DOCKER (1 references)
target prot opt source destination
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target prot opt source destination
DOCKER-ISOLATION-STAGE-2 all -- 0.0.0.0/0 0.0.0.0/0
RETURN all -- 0.0.0.0/0 0.0.0.0/0
Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target prot opt source destination
DROP all -- 0.0.0.0/0 0.0.0.0/0
RETURN all -- 0.0.0.0/0 0.0.0.0/0
Chain DOCKER-USER (1 references)
target prot opt source destination
RETURN all -- 0.0.0.0/0 0.0.0.0/0
Chain egress-tcp-1234 (1 references)
target prot opt source destination
DROP tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:1234
[ec2-user@ip-172-31-25-237 amazon-ecs-agent]$ sudo iptables -nL
Chain INPUT (policy ACCEPT)
target prot opt source destination
DROP tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:51678
DROP all -- !127.0.0.0/8 127.0.0.0/8 ! ctstate RELATED,ESTABLISHED,DNAT
Chain FORWARD (policy DROP)
target prot opt source destination
DOCKER-USER all -- 0.0.0.0/0 0.0.0.0/0
DOCKER-ISOLATION-STAGE-1 all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED
DOCKER all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain DOCKER (1 references)
target prot opt source destination
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target prot opt source destination
DOCKER-ISOLATION-STAGE-2 all -- 0.0.0.0/0 0.0.0.0/0
RETURN all -- 0.0.0.0/0 0.0.0.0/0
Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target prot opt source destination
DROP all -- 0.0.0.0/0 0.0.0.0/0
RETURN all -- 0.0.0.0/0 0.0.0.0/0
Chain DOCKER-USER (1 references)
target prot opt source destination
RETURN all -- 0.0.0.0/0 0.0.0.0/0
Started an AWSVPC mode task and tried starting, checking the status of, and stopping a BHP fault:
level=debug time=2024-09-26T20:15:08Z msg="Handling http request" method="POST" from="169.254.172.2:40238"
level=info time=2024-09-26T20:15:08Z msg="Received new request for request type: start network-blackhole-port" request="{\"Protocol\":\"tcp\",\"TrafficType\":\"egress\",\"Port\":1234}" requestType="start network-blackhole-port" tmdsEndpointContainerID="962b5329-d970-49ff-945a-8d7ce9ead90d"
level=info time=2024-09-26T20:15:08Z msg="[INFO] Black hole port fault is not running" command="nsenter --net=/host/proc/19943/ns/net iptables -C egress-tcp-1234 -p tcp --dport 1234 -j DROP" output="iptables: Bad rule (does a matching rule exist in that chain?).\n" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/80e14d09d9584860b984248e6ca0daa2" exitCode=1 netns="/host/proc/19943/ns/net"
level=info time=2024-09-26T20:15:08Z msg="[INFO] Attempting to start network black hole port fault" netns="/host/proc/19943/ns/net" chain="egress-tcp-1234" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/80e14d09d9584860b984248e6ca0daa2"
level=info time=2024-09-26T20:15:08Z msg="Successfully started fault" requestType="start network-blackhole-port" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}" response="{\"Status\":\"running\"}"
level=debug time=2024-09-26T20:15:12Z msg="Storage stats not reported for container" module=utils_unix.go
level=debug time=2024-09-26T20:15:18Z msg="Handling http request" method="POST" from="169.254.172.2:37178"
level=info time=2024-09-26T20:15:18Z msg="Received new request for request type: check status network-blackhole-port" request="{\"Protocol\":\"tcp\",\"TrafficType\":\"egress\",\"Port\":1234}" requestType="check status network-blackhole-port" tmdsEndpointContainerID="962b5329-d970-49ff-945a-8d7ce9ead90d"
level=debug time=2024-09-26T20:15:18Z msg="Successfully parsed fault request payload" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}"
level=info time=2024-09-26T20:15:18Z msg="[INFO] Black hole port fault has been found running" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/80e14d09d9584860b984248e6ca0daa2" netns="/host/proc/19943/ns/net" command="nsenter --net=/host/proc/19943/ns/net iptables -C egress-tcp-1234 -p tcp --dport 1234 -j DROP" output=""
level=info time=2024-09-26T20:15:18Z msg="Successfully check status fault" response="{\"Status\":\"running\"}" requestType="check status network-blackhole-port" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}"
level=debug time=2024-09-26T20:15:28Z msg="Handling http request" method="POST" from="169.254.172.2:42884"
level=info time=2024-09-26T20:15:28Z msg="Received new request for request type: stop network-blackhole-port" request="{\"Protocol\":\"tcp\",\"TrafficType\":\"egress\",\"Port\":1234}" requestType="stop network-blackhole-port" tmdsEndpointContainerID="962b5329-d970-49ff-945a-8d7ce9ead90d"
level=debug time=2024-09-26T20:15:28Z msg="Successfully parsed fault request payload" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}"
level=info time=2024-09-26T20:15:28Z msg="[INFO] Black hole port fault has been found running" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/80e14d09d9584860b984248e6ca0daa2" netns="/host/proc/19943/ns/net" command="nsenter --net=/host/proc/19943/ns/net iptables -C egress-tcp-1234 -p tcp --dport 1234 -j DROP" output=""
level=info time=2024-09-26T20:15:28Z msg="[INFO] Attempting to stop network black hole port fault" netns="/host/proc/19943/ns/net" chain="egress-tcp-1234" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/80e14d09d9584860b984248e6ca0daa2"
level=info time=2024-09-26T20:15:28Z msg="Successfully stopped fault" requestType="stop network-blackhole-port" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}" response="{\"Status\":\"stopped\"}"
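The logged commands differ between the two network modes only by the nsenter prefix: with netns="host" the iptables check runs directly, while for the AWSVPC task it is wrapped in `nsenter --net=<netns path>`. The sketch below reproduces the two command strings seen in the logs. The helper names (`faultRequest`, `chainName`, `checkCommand`) are hypothetical, and how the agent actually assembles and executes these commands is not shown in this description.

```go
package main

import (
	"fmt"
	"strings"
)

// faultRequest mirrors the fields of the logged request payload.
type faultRequest struct {
	Port        int
	Protocol    string
	TrafficType string
}

// chainName follows the "<traffic type>-<protocol>-<port>" pattern seen in the logs.
func chainName(r faultRequest) string {
	return fmt.Sprintf("%s-%s-%d", r.TrafficType, r.Protocol, r.Port)
}

// checkCommand builds the rule-existence check. For any netns other than
// "host", the command is prefixed with nsenter to run inside that namespace.
func checkCommand(r faultRequest, netns string) string {
	args := []string{
		"iptables", "-C", chainName(r),
		"-p", r.Protocol,
		"--dport", fmt.Sprint(r.Port),
		"-j", "DROP",
	}
	if netns != "host" {
		args = append([]string{"nsenter", "--net=" + netns}, args...)
	}
	return strings.Join(args, " ")
}

func main() {
	req := faultRequest{Port: 1234, Protocol: "tcp", TrafficType: "egress"}

	// prints: iptables -C egress-tcp-1234 -p tcp --dport 1234 -j DROP
	fmt.Println(checkCommand(req, "host"))

	// prints: nsenter --net=/host/proc/19943/ns/net iptables -C egress-tcp-1234 -p tcp --dport 1234 -j DROP
	fmt.Println(checkCommand(req, "/host/proc/19943/ns/net"))
}
```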
Iptables of the AWSVPC task:
Description for the changelog
Feature: Obtain default network interface name on the host within task metadata
Additional Information
Does this PR include breaking model changes? If so, have you added transformation functions?
Does this PR include the addition of new environment variables in the README?
Licensing
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.