aws / amazon-ecs-agent

Amazon Elastic Container Service Agent
http://aws.amazon.com/ecs/
Apache License 2.0

Obtain default network interface name within task metadata state #4367

Closed mye956 closed 2 months ago

mye956 commented 2 months ago

Summary

This PR adds a new method that lets the fault handlers obtain the task network config together with the default network interface name. It also adds a dedicated function for obtaining the task metadata with the task network configuration set/initialized.

Implementation details

This PR introduces a new function, GetTaskMetadataWithTaskNetworkConfig(), which is responsible for obtaining the task response together with the task network configuration. If the task is in host mode, we try to obtain the default network interface name in the host network namespace by consuming the changes introduced in https://github.com/aws/amazon-ecs-agent/pull/4342.
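As a rough illustration of what "default network interface name" means here, the sketch below finds the interface that carries the IPv4 default route. It is not the agent's implementation: the debug logs in this PR suggest the agent looks up the route and then resolves the link by index, whereas this sketch swaps in a plain parse of text in the Linux /proc/net/route format so that it needs only the standard library. The function name defaultInterfaceFromRouteTable and the sample table are hypothetical.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// defaultInterfaceFromRouteTable (hypothetical helper) scans text in the
// /proc/net/route format and returns the interface of the first route whose
// destination and mask are both 00000000, i.e. the IPv4 default route.
func defaultInterfaceFromRouteTable(table string) (string, error) {
	scanner := bufio.NewScanner(strings.NewReader(table))
	scanner.Scan() // skip the header line
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		// Columns: Iface Destination Gateway Flags RefCnt Use Metric Mask ...
		if len(fields) < 8 {
			continue
		}
		if fields[1] == "00000000" && fields[7] == "00000000" {
			return fields[0], nil
		}
	}
	if err := scanner.Err(); err != nil {
		return "", err
	}
	return "", errors.New("no default route found")
}

// sampleRouteTable is made-up data for illustration. The gateway 01101FAC is
// 172.31.16.1 in the little-endian hex form used by /proc/net/route.
const sampleRouteTable = `Iface Destination Gateway Flags RefCnt Use Metric Mask MTU Window IRTT
eth0 00000000 01101FAC 0003 0 0 0 00000000 0 0 0
eth0 00101FAC 00000000 0001 0 0 0 00F0FFFF 0 0 0
`

func main() {
	name, err := defaultInterfaceFromRouteTable(sampleRouteTable)
	fmt.Println(name, err)
}
```

On a real host the same function could be fed the contents of /proc/net/route; parsing a string keeps the sketch testable without root or a particular network setup.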

Testing

Moved the TestV4GetTaskMetadataWithTaskNetworkConfig tests into OS/platform-specific files; they now consume the new GetTaskMetadataWithTaskNetworkConfig functionality.

New tests cover the changes: yes

Manual testing

Started a host mode task, then started, checked the status of, and stopped a black hole port (BHP) fault:

level=debug time=2024-09-26T20:13:03Z msg="Handling http request" method="POST" from="172.31.25.237:53534"
level=info time=2024-09-26T20:13:03Z msg="Received new request for request type: start network-blackhole-port" request="{\"Protocol\":\"tcp\",\"TrafficType\":\"egress\",\"Port\":1234}" requestType="start network-blackhole-port" tmdsEndpointContainerID="03da5270-90d9-4571-bf16-2dd86d87e9f6"
level=debug time=2024-09-26T20:13:03Z msg="Found route" Route={Ifindex: 2 Dst: <nil> Src: <nil> Gw: 172.31.16.1 Flags: [] Table: 254 Realm: 0}
level=debug time=2024-09-26T20:13:03Z msg="Found the associated network interface by the index" LinkName="eth0" LinkIndex=2
level=info time=2024-09-26T20:13:03Z msg="Obtained the default network interface name on host" taskARN="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f" defaultDeviceName="eth0"
level=info time=2024-09-26T20:13:03Z msg="[INFO] Black hole port fault is not running" command="iptables -C egress-tcp-1234 -p tcp --dport 1234 -j DROP" output="iptables: Bad rule (does a matching rule exist in that chain?).\n" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f" exitCode=1 netns="host"
level=info time=2024-09-26T20:13:03Z msg="[INFO] Attempting to start network black hole port fault" chain="egress-tcp-1234" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f" netns="host"
level=info time=2024-09-26T20:13:03Z msg="Successfully started fault" requestType="start network-blackhole-port" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}" response="{\"Status\":\"running\"}"
level=debug time=2024-09-26T20:13:13Z msg="Handling http request" method="POST" from="172.31.25.237:43658"
level=info time=2024-09-26T20:13:13Z msg="Received new request for request type: check status network-blackhole-port" request="{\"Protocol\":\"tcp\",\"TrafficType\":\"egress\",\"Port\":1234}" requestType="check status network-blackhole-port" tmdsEndpointContainerID="03da5270-90d9-4571-bf16-2dd86d87e9f6"
level=debug time=2024-09-26T20:13:13Z msg="Successfully parsed fault request payload" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}"
level=debug time=2024-09-26T20:13:13Z msg="Found route" Route={Ifindex: 2 Dst: <nil> Src: <nil> Gw: 172.31.16.1 Flags: [] Table: 254 Realm: 0}
level=debug time=2024-09-26T20:13:13Z msg="Found the associated network interface by the index" LinkName="eth0" LinkIndex=2
level=info time=2024-09-26T20:13:13Z msg="Obtained the default network interface name on host" taskARN="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f" defaultDeviceName="eth0"
level=info time=2024-09-26T20:13:13Z msg="[INFO] Black hole port fault has been found running" command="iptables -C egress-tcp-1234 -p tcp --dport 1234 -j DROP" output="" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f" netns="host"
level=info time=2024-09-26T20:13:13Z msg="Successfully check status fault" requestType="check status network-blackhole-port" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}" response="{\"Status\":\"running\"}"
level=debug time=2024-09-26T20:13:23Z msg="Handling http request" method="POST" from="172.31.25.237:42884"
level=info time=2024-09-26T20:13:23Z msg="Received new request for request type: stop network-blackhole-port" request="{\"Protocol\":\"tcp\",\"TrafficType\":\"egress\",\"Port\":1234}" requestType="stop network-blackhole-port" tmdsEndpointContainerID="03da5270-90d9-4571-bf16-2dd86d87e9f6"
level=debug time=2024-09-26T20:13:23Z msg="Successfully parsed fault request payload" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}"
level=debug time=2024-09-26T20:13:23Z msg="Found route" Route={Ifindex: 2 Dst: <nil> Src: <nil> Gw: 172.31.16.1 Flags: [] Table: 254 Realm: 0}
level=debug time=2024-09-26T20:13:23Z msg="Found the associated network interface by the index" LinkName="eth0" LinkIndex=2
level=info time=2024-09-26T20:13:23Z msg="Obtained the default network interface name on host" taskARN="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f" defaultDeviceName="eth0"
level=info time=2024-09-26T20:13:23Z msg="[INFO] Black hole port fault has been found running" netns="host" command="iptables -C egress-tcp-1234 -p tcp --dport 1234 -j DROP" output="" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f"
level=info time=2024-09-26T20:13:23Z msg="[INFO] Attempting to stop network black hole port fault" netns="host" chain="egress-tcp-1234" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/12d38370cc4448fca63e3657891d8f7f"
level=info time=2024-09-26T20:13:23Z msg="Successfully stopped fault" requestType="stop network-blackhole-port" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}" response="{\"Status\":\"stopped\"}"
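The request payloads and chain names in the logs above follow a visible pattern: a JSON body with Port, Protocol, and TrafficType, and a chain named after those three values (egress-tcp-1234). The sketch below shows one way to decode such a payload and derive the chain name. The type and function names are hypothetical, and the field validation is an assumption rather than the agent's actual checks.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// blackHolePortRequest (hypothetical name) mirrors the JSON fields that
// appear in the logged request payloads.
type blackHolePortRequest struct {
	Port        uint16 `json:"Port"`
	Protocol    string `json:"Protocol"`
	TrafficType string `json:"TrafficType"`
}

// parseRequest decodes the payload and rejects bodies with missing fields.
// The validation rules here are an assumption made for this sketch.
func parseRequest(body []byte) (blackHolePortRequest, error) {
	var req blackHolePortRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return req, err
	}
	if req.Port == 0 || req.Protocol == "" || req.TrafficType == "" {
		return req, fmt.Errorf("missing required field in %s", body)
	}
	return req, nil
}

// chainName builds the <trafficType>-<protocol>-<port> name seen in the logs.
func chainName(req blackHolePortRequest) string {
	return fmt.Sprintf("%s-%s-%d", req.TrafficType, req.Protocol, req.Port)
}

func main() {
	req, err := parseRequest([]byte(`{"Port":1234,"Protocol":"tcp","TrafficType":"egress"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(chainName(req))
}
```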

iptables rules on the host for the host mode task (before starting the fault, while it is running, and after stopping it):

[ec2-user@ip-172-31-25-237 amazon-ecs-agent]$ sudo iptables -nL
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
DROP       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:51678
DROP       all  -- !127.0.0.0/8          127.0.0.0/8          ! ctstate RELATED,ESTABLISHED,DNAT

Chain FORWARD (policy DROP)
target     prot opt source               destination         
DOCKER-USER  all  --  0.0.0.0/0            0.0.0.0/0           
DOCKER-ISOLATION-STAGE-1  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         

Chain DOCKER (1 references)
target     prot opt source               destination         

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target     prot opt source               destination         
DOCKER-ISOLATION-STAGE-2  all  --  0.0.0.0/0            0.0.0.0/0           
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target     prot opt source               destination         
DROP       all  --  0.0.0.0/0            0.0.0.0/0           
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-USER (1 references)
target     prot opt source               destination         
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           
[ec2-user@ip-172-31-25-237 amazon-ecs-agent]$ sudo iptables -nL
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
DROP       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:51678
DROP       all  -- !127.0.0.0/8          127.0.0.0/8          ! ctstate RELATED,ESTABLISHED,DNAT

Chain FORWARD (policy DROP)
target     prot opt source               destination         
DOCKER-USER  all  --  0.0.0.0/0            0.0.0.0/0           
DOCKER-ISOLATION-STAGE-1  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
egress-tcp-1234  all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER (1 references)
target     prot opt source               destination         

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target     prot opt source               destination         
DOCKER-ISOLATION-STAGE-2  all  --  0.0.0.0/0            0.0.0.0/0           
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target     prot opt source               destination         
DROP       all  --  0.0.0.0/0            0.0.0.0/0           
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-USER (1 references)
target     prot opt source               destination         
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain egress-tcp-1234 (1 references)
target     prot opt source               destination         
DROP       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:1234
[ec2-user@ip-172-31-25-237 amazon-ecs-agent]$ sudo iptables -nL
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
DROP       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:51678
DROP       all  -- !127.0.0.0/8          127.0.0.0/8          ! ctstate RELATED,ESTABLISHED,DNAT

Chain FORWARD (policy DROP)
target     prot opt source               destination         
DOCKER-USER  all  --  0.0.0.0/0            0.0.0.0/0           
DOCKER-ISOLATION-STAGE-1  all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         

Chain DOCKER (1 references)
target     prot opt source               destination         

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target     prot opt source               destination         
DOCKER-ISOLATION-STAGE-2  all  --  0.0.0.0/0            0.0.0.0/0           
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target     prot opt source               destination         
DROP       all  --  0.0.0.0/0            0.0.0.0/0           
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain DOCKER-USER (1 references)
target     prot opt source               destination         
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Started an AWSVPC mode task, then started, checked the status of, and stopped a BHP fault:

level=debug time=2024-09-26T20:15:08Z msg="Handling http request" method="POST" from="169.254.172.2:40238"
level=info time=2024-09-26T20:15:08Z msg="Received new request for request type: start network-blackhole-port" request="{\"Protocol\":\"tcp\",\"TrafficType\":\"egress\",\"Port\":1234}" requestType="start network-blackhole-port" tmdsEndpointContainerID="962b5329-d970-49ff-945a-8d7ce9ead90d"
level=info time=2024-09-26T20:15:08Z msg="[INFO] Black hole port fault is not running" command="nsenter --net=/host/proc/19943/ns/net iptables -C egress-tcp-1234 -p tcp --dport 1234 -j DROP" output="iptables: Bad rule (does a matching rule exist in that chain?).\n" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/80e14d09d9584860b984248e6ca0daa2" exitCode=1 netns="/host/proc/19943/ns/net"
level=info time=2024-09-26T20:15:08Z msg="[INFO] Attempting to start network black hole port fault" netns="/host/proc/19943/ns/net" chain="egress-tcp-1234" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/80e14d09d9584860b984248e6ca0daa2"
level=info time=2024-09-26T20:15:08Z msg="Successfully started fault" requestType="start network-blackhole-port" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}" response="{\"Status\":\"running\"}"
level=debug time=2024-09-26T20:15:12Z msg="Storage stats not reported for container" module=utils_unix.go
level=debug time=2024-09-26T20:15:18Z msg="Handling http request" method="POST" from="169.254.172.2:37178"
level=info time=2024-09-26T20:15:18Z msg="Received new request for request type: check status network-blackhole-port" request="{\"Protocol\":\"tcp\",\"TrafficType\":\"egress\",\"Port\":1234}" requestType="check status network-blackhole-port" tmdsEndpointContainerID="962b5329-d970-49ff-945a-8d7ce9ead90d"
level=debug time=2024-09-26T20:15:18Z msg="Successfully parsed fault request payload" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}"
level=info time=2024-09-26T20:15:18Z msg="[INFO] Black hole port fault has been found running" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/80e14d09d9584860b984248e6ca0daa2" netns="/host/proc/19943/ns/net" command="nsenter --net=/host/proc/19943/ns/net iptables -C egress-tcp-1234 -p tcp --dport 1234 -j DROP" output=""
level=info time=2024-09-26T20:15:18Z msg="Successfully check status fault" response="{\"Status\":\"running\"}" requestType="check status network-blackhole-port" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}"
level=debug time=2024-09-26T20:15:28Z msg="Handling http request" method="POST" from="169.254.172.2:42884"
level=info time=2024-09-26T20:15:28Z msg="Received new request for request type: stop network-blackhole-port" request="{\"Protocol\":\"tcp\",\"TrafficType\":\"egress\",\"Port\":1234}" requestType="stop network-blackhole-port" tmdsEndpointContainerID="962b5329-d970-49ff-945a-8d7ce9ead90d"
level=debug time=2024-09-26T20:15:28Z msg="Successfully parsed fault request payload" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}"
level=info time=2024-09-26T20:15:28Z msg="[INFO] Black hole port fault has been found running" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/80e14d09d9584860b984248e6ca0daa2" netns="/host/proc/19943/ns/net" command="nsenter --net=/host/proc/19943/ns/net iptables -C egress-tcp-1234 -p tcp --dport 1234 -j DROP" output=""
level=info time=2024-09-26T20:15:28Z msg="[INFO] Attempting to stop network black hole port fault" netns="/host/proc/19943/ns/net" chain="egress-tcp-1234" taskArn="arn:aws:ecs:us-west-2:113424923516:task/default/80e14d09d9584860b984248e6ca0daa2"
level=info time=2024-09-26T20:15:28Z msg="Successfully stopped fault" requestType="stop network-blackhole-port" request="{\"Port\":1234,\"Protocol\":\"tcp\",\"TrafficType\":\"egress\"}" response="{\"Status\":\"stopped\"}"

iptables rules inside the AWSVPC task's network namespace (while the fault is running, then after stopping it):

[ec2-user@ip-172-31-25-237 amazon-ecs-agent]$ sudo nsenter --net=/proc/19943/ns/net iptables -nL
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
egress-tcp-1234  all  --  0.0.0.0/0            0.0.0.0/0           

Chain egress-tcp-1234 (1 references)
target     prot opt source               destination         
DROP       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:1234
[ec2-user@ip-172-31-25-237 amazon-ecs-agent]$ sudo nsenter --net=/proc/19943/ns/net iptables -nL
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Description for the changelog

Feature: Obtain default network interface name on the host within task metadata

Additional Information

Does this PR include breaking model changes? If so, have you added transformation functions?

Does this PR include the addition of new environment variables in the README?

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.