Closed JoshSalomon closed 6 months ago
@JoshSalomon: This pull request references MON-3707 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.16.0" version, but no target version was set.
/retest-required
IIUC the goal of the metric is to report 1 for any active combination of mode/legacy API. The following expression would be better suited:
group by(mode,is_legacy_api) (openshift_network_operator_ipsec_state{namespace=~"openshift-network-operator"})
At any point in time, there is only one active combination (the code resets the gauge and then calls ipsecStateGauge.WithLabelValues once), so I don't think there is a big difference here.
Finally, the metric name could be improved IMHO: openshift:openshift_network_operator_ipsec_state:sum stutters, I'd suggest cluster:openshift_network_operator_ipsec_state:info.
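For readers following along, the reset-then-set behaviour described in the comment above can be sketched as follows. This is a toy Python model, not the CNO code and not the Prometheus client API: the `GaugeVec` class, the `report_ipsec_state` helper, the value 1, and the "true"/"false" values for is_legacy_api are all assumptions made for illustration.

```python
class GaugeVec:
    """Toy stand-in for a labelled gauge (hypothetical, not a real client library)."""

    def __init__(self, label_names):
        self.label_names = tuple(label_names)
        self.series = {}  # maps a tuple of label values to the gauge value

    def reset(self):
        # Drop every previously exported label combination.
        self.series.clear()

    def set(self, value, **labels):
        key = tuple(labels[name] for name in self.label_names)
        self.series[key] = value


def report_ipsec_state(gauge, mode, is_legacy_api):
    # Assumed pattern: reset first, then set exactly one combination to 1.
    gauge.reset()
    gauge.set(1, mode=mode, is_legacy_api=is_legacy_api)


gauge = GaugeVec(["mode", "is_legacy_api"])
report_ipsec_state(gauge, "Disabled", "true")
report_ipsec_state(gauge, "Full", "false")
print(gauge.series)  # only the most recent combination remains
```

Because of the reset, the earlier Disabled series disappears, which is why summing over the series currently yields the same result as grouping them.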
I agree with both comments, but the CNO PR took a long time to approve and we need this in 4.16. Are these comments critical enough to put the feature at risk for 4.16?
/retest-required
/hold
/retest-required
At any point in time, there is only one active combination (the code resets the gauge and then calls ipsecStateGauge.WithLabelValues once), so I don't think there is a big difference here.
If the underlying code changes, a group by will ensure the value safely remains a boolean, and won't be prone to going out of bounds.
This is obvious but just to be explicit, you'll also need to s/sum/info in the source, otherwise this won't work. I'm not sure if there's a PR up for that in CNO.
PS. The tests are failing due to internal changes, and should be fixed soon.
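To illustrate the point about group by staying within bounds, here is a small Python model of the two aggregations. It only mimics the relevant PromQL semantics (sum adds the sample values per label group, group always yields 1 per label group); the sample data, the extra `pod` label, and the "true"/"false" values are hypothetical.

```python
from collections import defaultdict


def sum_by(samples, by):
    """Mimic PromQL `sum by(...)`: add up sample values per label group."""
    out = defaultdict(float)
    for labels, value in samples:
        out[tuple(labels[name] for name in by)] += value
    return dict(out)


def group_by(samples, by):
    """Mimic PromQL `group by(...)`: every resulting group has the value 1."""
    return {tuple(labels[name] for name in by): 1 for labels, _ in samples}


by = ("mode", "is_legacy_api")

# Today: a single active series, so both aggregations agree.
today = [({"mode": "Full", "is_legacy_api": "false", "pod": "a"}, 1)]

# Hypothetical future change: a second series with the same mode/API labels.
changed = today + [({"mode": "Full", "is_legacy_api": "false", "pod": "b"}, 1)]

print(sum_by(today, by), group_by(today, by))
print(sum_by(changed, by), group_by(changed, by))
```

In the second case the sum becomes 2 while the grouped value stays at 1, which is the out-of-bounds risk the comment refers to.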
PR is up in CNO: https://github.com/openshift/cluster-network-operator/pull/2346 (it got an lgtm from the networking team). @rexagod, would you mind double-checking?
/retest-required
/unhold
The SDN PR https://github.com/openshift/cluster-network-operator/pull/2346 was merged.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: JoshSalomon, rexagod
/retest-required
Remaining retests: 0 against base HEAD 5af508b31380d4b7b1a562e942559aea49b0121b and 2 for PR HEAD 499bcca13800687dcecbddecc8c717f6a75a1c00 in total
@JoshSalomon: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-aws-ovn-single-node | 499bcca13800687dcecbddecc8c717f6a75a1c00 | link | false | /test e2e-aws-ovn-single-node |
/retest-required
Remaining retests: 0 against base HEAD 98a17212947fd12f78c6d8e6d1d45775a692eae1 and 1 for PR HEAD 499bcca13800687dcecbddecc8c717f6a75a1c00 in total
/retest-required
[ART PR BUILD NOTIFIER]
This PR has been included in build cluster-monitoring-operator-container-v4.16.0-202404222343.p0.gbbde8c3.assembly.stream.el9 for distgit cluster-monitoring-operator. All builds following this will include this PR.
/cherrypick release-4.15
@JoshSalomon: #2326 failed to apply on top of branch "release-4.15":
Applying: Add ipsec state metric into telemetry
Using index info to reconstruct a base tree...
M Documentation/data-collection.md
M Documentation/sample-metrics.md
M Documentation/telemetry/telemeter_query
M manifests/0000_50_cluster-monitoring-operator_04-config.yaml
Falling back to patching base and 3-way merge...
Auto-merging manifests/0000_50_cluster-monitoring-operator_04-config.yaml
Auto-merging Documentation/telemetry/telemeter_query
CONFLICT (content): Merge conflict in Documentation/telemetry/telemeter_query
Auto-merging Documentation/sample-metrics.md
CONFLICT (content): Merge conflict in Documentation/sample-metrics.md
Auto-merging Documentation/data-collection.md
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Add ipsec state metric into telemetry
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Add the metric openshift:openshift_network_operator_ipsec_state:sum to telemetry.
This metric captures the IPsec state of the cluster (Disabled, External or Full) and whether the state was set by the legacy API (OCP 4.14 or before) or the new API (OCP 4.15+).