netobserv / netobserv-ebpf-agent

Network Observability eBPF Agent
Apache License 2.0

NETOBSERV-1532: add TLS support to ebpf agent metrics config #305

Closed: msherif1234 closed this 3 months ago

msherif1234 commented 3 months ago

Description

Add the ability to use TLS for the metrics server.

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

openshift-ci-robot commented 3 months ago

@msherif1234: This pull request references NETOBSERV-1532 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

In response to [this](https://github.com/netobserv/netobserv-ebpf-agent/pull/305):

>## Description
>Add the ability to use TLS for the metrics server,
>
>## Dependencies
>
>n/a
>
>## Checklist
>
>If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.
>
>* [ ] Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
>* [ ] Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix _(in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes)._
>* [ ] Does this PR require product documentation?
>  * [ ] If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
>* [ ] Does this PR require a product release notes entry?
>  * [ ] If so, fill in "Release Note Text" in the JIRA.
>* [ ] Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
>  * [ ] If so, make sure it is described in the JIRA ticket.
>* QE requirements (check 1 from the list):
>  * [ ] Standard QE validation, with pre-merge tests unless stated otherwise.
>  * [ ] Regression tests only (e.g. refactoring with no user-facing change).
>  * [ ] No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=netobserv%2Fnetobserv-ebpf-agent).
If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 30.00%, with 7 lines in your changes missing coverage. Please review.

Project coverage is 34.01%. Comparing base (3a12ba2) to head (aa041de).

| Files | Patch % | Lines |
|---|---|---|
| pkg/agent/agent.go | 0.00% | 4 Missing :warning: |
| pkg/prometheus/prom_server.go | 50.00% | 1 Missing and 2 partials :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #305      +/-   ##
==========================================
- Coverage   34.04%   34.01%   -0.03%
==========================================
  Files          47       47
  Lines        3836     3845       +9
==========================================
+ Hits         1306     1308       +2
- Misses       2444     2449       +5
- Partials       86       88       +2
```

| [Flag](https://app.codecov.io/gh/netobserv/netobserv-ebpf-agent/pull/305/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=netobserv) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/netobserv/netobserv-ebpf-agent/pull/305/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=netobserv) | `34.01% <30.00%> (-0.03%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=netobserv#carryforward-flags-in-the-pull-request-comment) to find out more.

github-actions[bot] commented 3 months ago

New image: quay.io/netobserv/netobserv-ebpf-agent:edd4cb8

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=edd4cb8 make set-agent-image

msherif1234 commented 3 months ago

/ok-to-test

github-actions[bot] commented 3 months ago

New image: quay.io/netobserv/netobserv-ebpf-agent:b466e4a

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=b466e4a make set-agent-image

memodi commented 3 months ago

@msherif1234 - I tried to enable metrics with TLS using the config below in the FlowCollector:

      metrics:
        enable: true
        server:
          port: 9090
          tls:
            insecureSkipVerify: false
            type: Auto

The eBPF agent pods end up in an error state:

time="2024-03-26T15:28:50Z" level=info msg="starting NetObserv eBPF Agent"
time="2024-03-26T15:28:50Z" level=info msg="initializing Flows agent" component=agent.Flows
time="2024-03-26T15:28:50Z" level=info msg="StartServerAsync: addr = :9090" component=prometheus
time="2024-03-26T15:28:50Z" level=info msg="push CTRL+C or send SIGTERM to interrupt execution"
time="2024-03-26T15:28:50Z" level=info msg="starting Flows agent" component=agent.Flows
time="2024-03-26T15:28:50Z" level=warning msg="can't detect any network-namespaces err: open /var/run/netns: no such file or directory [Ignore if the agent privileged flag is not set]" component=ifaces.Watcher
time="2024-03-26T15:28:50Z" level=warning msg="failed to add watcher to netns directory err: no such file or directory [Ignore if the agent privileged flag is not set]" component=ifaces.Watcher
time="2024-03-26T15:28:50Z" level=fatal msg="error in http.ListenAndServe: open tls.crt: no such file or directory" component=prometheus

msherif1234 commented 3 months ago

@memodi there were missing mounts on the operator side. I just updated the operator PR to add the proper mounts.

msherif1234 commented 3 months ago

/ok-to-test

github-actions[bot] commented 3 months ago

New image: quay.io/netobserv/netobserv-ebpf-agent:e418bc9

It will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=e418bc9 make set-agent-image

codecov-commenter commented 3 months ago

Codecov Report

Attention: Patch coverage is 25.00%, with 9 lines in your changes missing coverage. Please review.

Project coverage is 33.84%. Comparing base (a5bcf49) to head (68f00d3).

| Files | Patch % | Lines |
|---|---|---|
| pkg/agent/agent.go | 0.00% | 6 Missing :warning: |
| pkg/prometheus/prom_server.go | 50.00% | 1 Missing and 2 partials :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #305      +/-   ##
==========================================
- Coverage   34.04%   33.84%   -0.21%
==========================================
  Files          47       47
  Lines        3836     3847      +11
==========================================
- Hits         1306     1302       -4
- Misses       2444     2456      +12
- Partials       86       89       +3
```

| [Flag](https://app.codecov.io/gh/netobserv/netobserv-ebpf-agent/pull/305/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=netobserv) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/netobserv/netobserv-ebpf-agent/pull/305/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=netobserv) | `33.84% <25.00%> (-0.21%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=netobserv#carryforward-flags-in-the-pull-request-comment) to find out more.

memodi commented 3 months ago

/label qe-approved

openshift-ci-robot commented 3 months ago

@msherif1234: This pull request references NETOBSERV-1532 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

msherif1234 commented 3 months ago

/approve

openshift-ci[bot] commented 3 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: msherif1234

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
- ~~[OWNERS](https://github.com/codeready-toolchain/toolchain-e2e/blob/main/OWNERS)~~ [msherif1234]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment