keikoproj / lifecycle-manager

Graceful AWS scaling event on Kubernetes using lifecycle hooks
Apache License 2.0

Run image as lifecycle-manager user #89

Closed avestuk closed 1 year ago

avestuk commented 1 year ago

Docker containers run as root if no other user is specified. Building on the strong security posture of the scratch base image, we can run as the unprivileged user "lifecycle-manager" to improve security further.

Fixes #88

shrinandj commented 1 year ago

Can you share some details of the testing done with this change? Were you able to build a new image and run it in a cluster to confirm that lifecycle-manager works as expected with this change?

codecov[bot] commented 1 year ago

Codecov Report

Merging #89 (8355547) into master (de2f378) will not change coverage. The diff coverage is n/a.

@@           Coverage Diff           @@
##           master      #89   +/-   ##
=======================================
  Coverage   70.74%   70.74%           
=======================================
  Files          12       12           
  Lines        1234     1234           
=======================================
  Hits          873      873           
  Misses        298      298           
  Partials       63       63           


avestuk commented 1 year ago

@shrinandj It's essentially just a case of making sure that the binary can run. The metrics are served on port 8080, so no privileges are required for that. I'm running the image in my cluster now, so I can report back after the scaling activity overnight, but I can't foresee any issues.
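
The point about port 8080 can be illustrated with a minimal Go sketch. It is not taken from lifecycle-manager; the helper `needsPrivilege` and the constant `privilegedPortLimit` are hypothetical names introduced here, and the 1024 boundary is the conventional Linux default, which some hosts and container runtimes relax.

```go
package main

import (
	"fmt"
	"os"
)

// privilegedPortLimit is the conventional Linux boundary (an assumption about
// the host configuration): ports below it normally require root or the
// CAP_NET_BIND_SERVICE capability to bind.
const privilegedPortLimit = 1024

// needsPrivilege is a hypothetical helper that reports whether binding to
// port would normally require elevated privileges under that convention.
func needsPrivilege(port int) bool {
	return port > 0 && port < privilegedPortLimit
}

func main() {
	const metricsPort = 8080 // metrics port mentioned in the comment above

	// os.Geteuid returns 0 when the process runs as root.
	fmt.Printf("running as root: %v\n", os.Geteuid() == 0)
	fmt.Printf("port %d needs privilege: %v\n", metricsPort, needsPrivilege(metricsPort))
}
```

Under these assumptions, a process running as a non-root user can still bind the metrics port, which is why the change should not affect the metrics server.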

shrinandj commented 1 year ago

Sounds great!

I'll let @kevdowney and/or @tekenstam make the final call about if/when to merge this.

avestuk commented 1 year ago

Ran lifecycle-manager using a container built from this branch in my cluster overnight and it behaved as expected.

time="2023-05-05T06:48:03Z" level=info msg="starting lifecycle-manager service v0.5.1"
time="2023-05-05T06:48:03Z" level=info msg="region = eu-west-1"
time="2023-05-05T06:48:03Z" level=info msg="queue = lifecycle-manager-dev-blue"
time="2023-05-05T06:48:03Z" level=info msg="polling interval seconds = 10"
time="2023-05-05T06:48:03Z" level=info msg="max time to process seconds = 3600"
time="2023-05-05T06:48:03Z" level=info msg="node drain timeout seconds = 300"
time="2023-05-05T06:48:03Z" level=info msg="unknown node drain timeout seconds = 30"
time="2023-05-05T06:48:03Z" level=info msg="node drain retry interval seconds = 30"
time="2023-05-05T06:48:03Z" level=info msg="with alb deregister = true"
time="2023-05-05T06:48:03Z" level=info msg="starting metrics server on /metrics:8080"
time="2023-05-05T06:48:22Z" level=info msg="i-09c761014ad3cf509> received termination event"
time="2023-05-05T06:48:22Z" level=info msg="i-09c761014ad3cf509> sending heartbeat (1/24)"
time="2023-05-05T06:48:31Z" level=info msg="i-09c761014ad3cf509> draining node/ip-172-22-129-210.eu-west-1.compute.internal"
time="2023-05-05T06:48:40Z" level=info msg="i-09c761014ad3cf509> completed drain for node/ip-172-22-129-210.eu-west-1.compute.internal"
time="2023-05-05T06:48:40Z" level=info msg="i-09c761014ad3cf509> starting load balancer drain worker"
time="2023-05-05T06:48:57Z" level=info msg="i-09c761014ad3cf509> scanner starting"
time="2023-05-05T06:48:57Z" level=info msg="i-09c761014ad3cf509> checking targetgroup/elb membership"
time="2023-05-05T06:49:13Z" level=info msg="i-09c761014ad3cf509> found 0 target groups & 4 classic-elb"
time="2023-05-05T06:49:27Z" level=info msg="i-09c761014ad3cf509> queuing deregistrator"
time="2023-05-05T06:49:27Z" level=info msg="deregistrator> no active targets for deregistration"
time="2023-05-05T06:49:27Z" level=info msg="i-09c761014ad3cf509> queuing waiters"
...
kevdowney commented 1 year ago

@avestuk Thanks for the contribution and for testing it.

LGTM, we may in the future move to distroless container images.

avestuk commented 1 year ago

@kevdowney Do you know when a release with this change in it might happen?