aporeto-inc / trireme-lib

Simple, scalable and secure application segmentation
https://trireme.io
Apache License 2.0

Changing Kubernetes monitor startup behaviour #981

Closed mheese closed 4 years ago

mheese commented 4 years ago

The BCGov has seen fatal errors during the enforcer startup like this:

{"l":"fatal","t":1583278316.241352,"c":"run/run.go:570","m":"Unable to start monitors","error":"pod: controller did not start within 5s"}

This will change the behaviour to just log a warning and wait for the controllers to start. It will block indefinitely until the controllers have started.

Addresses https://github.com/aporeto-inc/aporeto/issues/2692

codecov[bot] commented 4 years ago

Codecov Report

Merging #981 into master will increase coverage by 0.5%. The diff coverage is 82.14%.


@@            Coverage Diff            @@
##           master     #981     +/-   ##
=========================================
+ Coverage   54.28%   54.78%   +0.5%     
=========================================
  Files         124      124             
  Lines       11971    11980      +9     
=========================================
+ Hits         6498     6563     +65     
+ Misses       4852     4788     -64     
- Partials      621      629      +8
| Impacted Files | Coverage Δ |
|---|---|
| monitor/internal/pod/monitor.go | 57.27% <82.14%> (+57.27%) :arrow_up: |
| monitor/internal/pod/resync.go | 77.02% <0%> (+2.7%) :arrow_up: |


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 197e1e3...18ac0a8.

mheese commented 4 years ago

/build - automatically fired by gogo with following PRs and commit SHAs v1.0.0

[
  {
    "project": "k8s-startup",
    "component": "enforcerd",
    "pr-id": "1601",
    "commit-sha": "5bc912993451feb292b1ff0daa01878911ac68b4",
    "pipeline": "master"
  },
  {
    "project": "k8s-startup",
    "component": "trireme-lib",
    "pr-id": "981",
    "commit-sha": "18ac0a8851bc87c7669b1004a23d6e21d220a5cd",
    "pipeline": "master"
  }
]

mheese commented 4 years ago

It turns out that all the previous FT runs have been failing because of this issue: https://ci.aporeto.io/teams/main/pipelines/pr-tests/jobs/functional-tests/builds/2932

As this is unrelated to the change, I'm going to merge this now.