Adding implementation for StopNetworkPacketLoss() and CheckNetworkPacketLoss(), as well as corresponding unit tests.
Implementation details
The check packet loss implementation was already introduced with the start-network-packet-loss API implementation. This change only adds the helper methods to the API call.
For stop packet loss, we first check whether a network packet loss fault already exists. If it does, we run the following commands to stop it:
tc qdisc del dev <interfaceName> parent 1:1 handle 10:
tc filter del dev <interfaceName> prio 1
tc qdisc del dev <interfaceName> root handle 1: prio
Testing
Unit tests for the TMDS package were run.
% go test -tags unit -v -run TestCheckNetworkPacketLoss /workplace/tianzes/amazon-ecs-agent/ecs-agent/tmds/handlers/fault/v1/handlers
...
--- PASS: TestCheckNetworkPacketLoss (0.00s)
% go test -tags unit -v -run TestStopNetworkPacketLoss /workplace/tianzes/amazon-ecs-agent/ecs-agent/tmds/handlers/fault/v1/handlers
...
--- PASS: TestStopNetworkPacketLoss (0.00s)
New tests cover the changes:
In addition to the existing test cases for the handler, the following cases were added specifically for the start endpoint:
When there doesn't exist a fault on the instance
When there exists a packet loss fault
When there exists a latency fault
When the request contains an unknown field but the rest of the payload is proper.
Manual Testing
Now that we have implementation for all 3 packet loss APIs, we can test the complete workflow of start, check, and stop.
Launched a Fargate Instance with the changes. Launched a task with ecs-exec enabled.
# First curl the check packet loss endpoint. Since we haven't injected anything yet, response should be not running
% curl -X GET ${ECS_AGENT_URI}/fault/v1/network-packet-loss --data '{"lossPercent":6, "Sources":["192.168.0.1"]}'
{"Status":"not-running"}
# Now curl the start endpoint
% curl -X PUT ${ECS_AGENT_URI}/fault/v1/network-packet-loss --data '{"lossPercent":6, "Sources":["192.168.0.1", "10.1.1.1", "25.168.10.2"]}'
{"Status":"running"}
# Curl the check endpoint again to confirm that the fault is running. Note that the IP address in the check payload is different from the one in the start payload. This shouldn't matter because the check does not use it. If we agree that it isn't needed, we should take an action item to remove it from the check and stop payloads.
% curl -X GET ${ECS_AGENT_URI}/fault/v1/network-packet-loss --data '{"lossPercent":6, "Sources":["192.168.0.1"]}'
{"Status":"running"}
# Curl the stop endpoint with arbitrary IP address
% curl -X DELETE ${ECS_AGENT_URI}/fault/v1/network-packet-loss --data '{"lossPercent":6, "Sources":["192.168.0.1"]}'
{"Status":"stopped"}
# Finally, curl the check endpoint again. The result should be not-running
% curl -X GET ${ECS_AGENT_URI}/fault/v1/network-packet-loss --data '{"lossPercent":6, "Sources":["192.168.0.1"]}'
{"Status":"not-running"}
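For reference, the request and response bodies used in the curl calls above could be modeled in Go along these lines. The type and function names (`packetLossRequest`, `faultResponse`, `decodeRequest`, `encodeStatus`) are hypothetical and are not taken from the agent's code. The sketch relies on two documented behaviors of the standard `encoding/json` package: object keys are matched to struct fields case-insensitively, so `lossPercent` populates `LossPercent`, and unknown fields are ignored by default. Whether the real handler decodes the payload this way is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// packetLossRequest is a hypothetical type mirroring the curl payloads.
type packetLossRequest struct {
	LossPercent int      `json:"LossPercent"`
	Sources     []string `json:"Sources"`
}

// faultResponse is a hypothetical type mirroring the {"Status": "..."} bodies.
type faultResponse struct {
	Status string `json:"Status"`
}

// decodeRequest parses a request body with the default decoder settings,
// which ignore unknown fields and match keys case-insensitively.
func decodeRequest(body string) (packetLossRequest, error) {
	var req packetLossRequest
	err := json.Unmarshal([]byte(body), &req)
	return req, err
}

// encodeStatus renders a status string as a JSON response body.
func encodeStatus(status string) (string, error) {
	out, err := json.Marshal(faultResponse{Status: status})
	return string(out), err
}

func main() {
	req, err := decodeRequest(`{"lossPercent":6, "Sources":["192.168.0.1"]}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.LossPercent, req.Sources)

	resp, err := encodeStatus("not-running")
	if err != nil {
		panic(err)
	}
	fmt.Println(resp)
}
```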
Additional manual testing:
# Start a task, ecs-exec into the container, and inject 50% packet loss to 8.8.8.8
sh-5.2# curl -X PUT ${ECS_AGENT_URI}/fault/v1/network-packet-loss --data '{"lossPercent":50, "Sources":["8.8.8.8"]}'
# From the task container, ping 8.8.8.8, let it run for 30 seconds, and manually interrupt to see the stats.
sh-5.2# ping 8.8.8.8 -D
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
...
^C
--- 8.8.8.8 ping statistics ---
61 packets transmitted, 22 received, 63.9344% packet loss, time 60826ms
rtt min/avg/max/mdev = 7.932/8.042/8.892/0.189 ms
# We can see that the packet loss has been started as expected.
Description for the changelog
Add check and stop network packet loss implementation
Additional Information
Does this PR include breaking model changes? If so, have you added transformation functions?
**Does this PR include the addition of new environment variables in the README?**
Licensing
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.